On Wed, 02 Mar 2005 16:48:51 -0800, Brion Vibber <brion@pobox.com> wrote:
Lemme summarize the situation:
We have four link tables currently: links, brokenlinks, imagelinks, and categorylinks.
  links is from id->target id
  brokenlinks is from id->(text target namespace+title)
  imagelinks is from id->target title [namespace is 6 by definition]
  categorylinks is from id->target title [namespace is 14 by definition]
In all, the 'from' is a key on page_id (cur_id in old schema) which uniquely identifies the page doing the linking. This number persists across page renaming.
In imagelinks and categorylinks, the target title can be used in conjunction with the hardcoded namespace to join to page/cur for the target.
In brokenlinks, the target is ugly ugly text. This can't be used in any joins. It should be changed to (namespace,title) but we are too lazy and this hasn't been done yet. There is the additional problem that the size limit of the field isn't 100% correct so there might be inconsistencies with long titles.
In links, the target is a page_id/cur_id number, and can be used to do joins. BUT, since linking is done by *name*, not by number, a creation/deletion/renaming of the target page will break this entry. Thus we have to clean up links, and shuffle pages around between links and brokenlinks when these things happen.
This kind of updating can be a burden on the database during operations on heavily-linked pages, so it's something we scalability-conscious folk want to eliminate.
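To make that cost concrete, here is a minimal sketch of the shuffle on page creation. The table and column names (from_id, to_id, to_text) are illustrative assumptions, not the real schema, and namespaces are left out to keep it short. The point is only that creating one page touches one row per inbound link.

```python
import sqlite3

def make_old_style_db():
    """In-memory toy schema; column names are hypothetical, not MediaWiki's."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT UNIQUE);
        CREATE TABLE links (from_id INTEGER, to_id INTEGER);
        CREATE TABLE brokenlinks (from_id INTEGER, to_text TEXT);
    """)
    return db

def create_page(db, title):
    """Create a page, then move every broken link aimed at it into links.

    Returns (new page id, number of link rows that had to be moved).
    """
    new_id = db.execute(
        "INSERT INTO page (page_title) VALUES (?)", (title,)
    ).lastrowid
    # Every page that linked to the missing title now needs a row in links.
    waiting = db.execute(
        "SELECT from_id FROM brokenlinks WHERE to_text = ?", (title,)
    ).fetchall()
    db.executemany(
        "INSERT INTO links (from_id, to_id) VALUES (?, ?)",
        [(from_id, new_id) for (from_id,) in waiting],
    )
    db.execute("DELETE FROM brokenlinks WHERE to_text = ?", (title,))
    return new_id, len(waiting)
```

Deletion would be the mirror image (links rows moved back into brokenlinks), and a rename would do both at once.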
Thanks for the clarity, Brion. If I understand things correctly, we should look at eventually changing 'links' and 'brokenlinks' to use: from id->(to namespace, to title).
Would there be a need for separate tables then? Such a configuration would require a lookup on the page table for each link when rendering a page. Although, now that I think about it, that's probably required now anyway.
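Roughly what I have in mind, as a sketch only: a single table (called namelinks here, a made-up name) keyed on from id->(to namespace, to title), with a LEFT JOIN against page at render time to tell live links from broken ones. The column names are assumptions for illustration.

```python
import sqlite3

def make_name_based_db():
    """In-memory toy schema; 'namelinks' and its columns are hypothetical."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE page (
            page_id INTEGER PRIMARY KEY,
            page_namespace INTEGER,
            page_title TEXT,
            UNIQUE (page_namespace, page_title)
        );
        CREATE TABLE namelinks (
            from_id INTEGER,
            to_namespace INTEGER,
            to_title TEXT
        );
    """)
    return db

def link_status(db, from_id):
    """Return [(namespace, title, exists)] for every link on page from_id."""
    rows = db.execute(
        """
        SELECT nl.to_namespace, nl.to_title, p.page_id IS NOT NULL
        FROM namelinks AS nl
        LEFT JOIN page AS p
               ON p.page_namespace = nl.to_namespace
              AND p.page_title = nl.to_title
        WHERE nl.from_id = ?
        ORDER BY nl.to_namespace, nl.to_title
        """,
        (from_id,),
    ).fetchall()
    return [(ns, title, bool(exists)) for ns, title, exists in rows]
```

With this layout, creating or deleting the target page changes only the page table. The link rows stay put, and the same query simply starts reporting the link as live or broken.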
Categorylinks seems to be a bit of a special case, since it really amounts to a reverse link. Adding [[category:foo]] to page 'bar' effectively adds a link to 'bar' on category:foo. At the same time, there can be broken category links on a page, but not broken page links on the category listing.
At the moment, I'm thinking of adding a "fromNamespace" field to categorylinks. This will decouple the display of pages from the display of parent categories, and would facilitate breaking the category display into namespaces.
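A rough sketch of what that might look like, assuming a from_namespace column next to the existing from id and target title (all column names here are illustrative): the category listing can then be grouped by namespace straight from categorylinks, without joining to the page table.

```python
import sqlite3
from collections import defaultdict

def make_category_db():
    """In-memory toy schema; from_namespace is the proposed (hypothetical) field."""
    db = sqlite3.connect(":memory:")
    db.execute("""
        CREATE TABLE categorylinks (
            from_id INTEGER,
            from_namespace INTEGER,
            to_title TEXT
        )
    """)
    return db

def members_by_namespace(db, category_title):
    """Group the ids of pages in a category by the namespace they live in."""
    groups = defaultdict(list)
    rows = db.execute(
        """
        SELECT from_namespace, from_id
        FROM categorylinks
        WHERE to_title = ?
        ORDER BY from_namespace, from_id
        """,
        (category_title,),
    )
    for namespace, from_id in rows:
        groups[namespace].append(from_id)
    return dict(groups)
```

Here subcategories (namespace 14) come out as their own group, separate from articles. One cost of storing the namespace twice is that the field would need updating whenever a page is moved to a different namespace.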
-Rich Holton en.wikipedia:User:Rholton