I've checked some changes to the link tables into the head branch. They now all use a key on cur_id for the *_from column instead of strings, and have a unique index to prevent duplicate entries. There's not yet a clean step in the update script, so for now just clear out your links tables (patch-linktables.sql) and rebuild them with refreshLinks.php.
This saves trouble in a number of places: we can now join against the link tables to get other info (such as cur_is_redirect!) as well as the name. Fewer bits also need to be juggled on page renaming, since outgoing links no longer have to be changed (cur_id remains the same when a page is renamed).
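As a rough sketch of what that buys us, here is a toy version using Python's sqlite3 as a stand-in for MySQL. The cur columns are cut down to the ones mentioned above, and the links layout (l_from and l_to both holding page ids) is an assumption for illustration; the real tables may differ.

```python
import sqlite3

def build_demo_db():
    """Create a tiny in-memory stand-in for cur and links (assumed layout)."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE cur (
            cur_id          INTEGER PRIMARY KEY,
            cur_namespace   INTEGER NOT NULL,
            cur_title       TEXT NOT NULL,
            cur_is_redirect INTEGER NOT NULL DEFAULT 0
        );
        -- Assumption: both ends of a live link are numeric page ids,
        -- and the unique index rejects duplicate entries.
        CREATE TABLE links (
            l_from INTEGER NOT NULL,
            l_to   INTEGER NOT NULL,
            UNIQUE (l_from, l_to)
        );
        INSERT INTO cur VALUES (1, 0, 'Main_Page', 0),
                               (2, 0, 'Foo', 0),
                               (3, 0, 'Bar', 1);
        INSERT INTO links VALUES (1, 2), (3, 2);
    """)
    return db

def what_links_here(db, target_id):
    """Join links to cur to get the name and redirect flag of each linking page."""
    rows = db.execute(
        """SELECT cur_title, cur_is_redirect
             FROM links JOIN cur ON cur_id = l_from
            WHERE l_to = ?
            ORDER BY cur_title""",
        (target_id,),
    )
    return rows.fetchall()

db = build_demo_db()
print(what_links_here(db, 2))   # [('Bar', 1), ('Main_Page', 0)]

# Renaming a page touches one cur row; the links rows are left alone.
db.execute("UPDATE cur SET cur_title = 'Home' WHERE cur_id = 1")
print(what_links_here(db, 2))   # [('Bar', 1), ('Home', 0)]

# The unique index keeps a repeated link from being stored twice.
db.execute("INSERT OR IGNORE INTO links VALUES (1, 2)")
```

The rename only updates cur, yet the join immediately reports the new title, which is the point of keying the *_from column on cur_id.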
rebuildLinks.inc and some of the tools in the 'maintenance page' still need to be updated to work with the new setup. (Special:Maintenance needs a *lot* of cleanup in general. It's kind of a catch-all of vaguely defined features which suck performance like a hydroelectric dam.)
Also, I've slipped in some extra debug code. And I think 'indexes.sql' is a big waste of time and should all be moved into tables.sql. Building indexes separately doesn't help on InnoDB, and it won't do anything on MyISAM either if you're just going to replace the freshly built table with one imported from a dump, which creates it with its indexes anyway.
Note that one of the driving forces behind schema changes here is size & number of rows to change. We've had some troubles where someone tries to rename a page with a _very_ long edit history and the wiki gets a little lost doing the updates. Changing a username and reassigning the marked edits can be similarly problematic when a lot of edits have been made. Ideally, such operations shouldn't be too 'big'... A rename shouldn't have to touch potentially thousands of old_title fields, when we can change just one and let the unchanging numeric page id link the pages to it.
It might actually be a good idea to merge links and brokenlinks into a single links table that looks like this:

  l_from               -> key to cur_id
  l_to_ns, l_to_title  -> key to cur_namespace, cur_title
This would avoid any need to alter the links table on page rename, creation, or deletion: outgoing links are fixed to the page id, and incoming links are fixed to the page name. It's late and I can't think right now so I'm not sure if this would interfere terribly with operations that need to treat live and broken links differently; it would require a join to cur and a check for existence or null. For page rendering duties we can cache that lookup data in linkscc as we do now, of course.
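A minimal sketch of the merged table, again with Python's sqlite3 standing in for MySQL. Only the l_from, l_to_ns and l_to_title columns come from the proposal above; the helper names and sample rows are made up for illustration.

```python
import sqlite3

def build_merged_db():
    """Toy cur table plus the proposed single links table."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE cur (
            cur_id        INTEGER PRIMARY KEY,
            cur_namespace INTEGER NOT NULL,
            cur_title     TEXT NOT NULL,
            UNIQUE (cur_namespace, cur_title)
        );
        -- Outgoing end fixed to the page id, incoming end fixed to the name.
        CREATE TABLE links (
            l_from     INTEGER NOT NULL,
            l_to_ns    INTEGER NOT NULL,
            l_to_title TEXT NOT NULL,
            UNIQUE (l_from, l_to_ns, l_to_title)
        );
        INSERT INTO cur VALUES (1, 0, 'Main_Page'), (2, 0, 'Foo');
        INSERT INTO links VALUES (1, 0, 'Foo'), (1, 0, 'Missing_page');
    """)
    return db

def outgoing_links(db, from_id):
    """Return (namespace, title, exists) for every link on a page.

    The LEFT JOIN leaves cur_id NULL when the target does not exist,
    which is how live and broken links are told apart.
    """
    rows = db.execute(
        """SELECT l_to_ns, l_to_title, cur_id IS NOT NULL
             FROM links
             LEFT JOIN cur ON cur_namespace = l_to_ns
                          AND cur_title = l_to_title
            WHERE l_from = ?
            ORDER BY l_to_ns, l_to_title""",
        (from_id,),
    )
    return [(ns, title, bool(exists)) for ns, title, exists in rows]

merged = build_merged_db()
print(outgoing_links(merged, 1))
# [(0, 'Foo', True), (0, 'Missing_page', False)]

# Creating the missing page needs no change to links at all.
merged.execute("INSERT INTO cur VALUES (3, 0, 'Missing_page')")
print(outgoing_links(merged, 1))
# [(0, 'Foo', True), (0, 'Missing_page', True)]
```

The cost is the extra join and null check on every lookup, which is what the linkscc cache would hide during page rendering.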
-- brion vibber (brion @ pobox.com)
On Mar 11, 2004, at 06:11, Timwi wrote:
> Brion Vibber wrote:
>> I've checked into the head branch some changes to the link tables.
> Thanks!
> Could you also let us know when it goes live on the site? :-)

Not for a few weeks at least. Big changes still coming...
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org