I've gone ahead and made another change to the schema which I'd
originally passed over.
The links and brokenlinks tables are now merged into a single pagelinks
table, which records each link target as a namespace+title key pair rather
than as a page ID or a prefixed title.
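To make the shape of the merged table concrete, here is a rough sketch using an in-memory SQLite database. The column names and the cut-down page table are illustrative assumptions, not a copy of the real schema. The point is only that a link row stores the target's namespace and title, so whether a link is "broken" falls out of a join rather than depending on which table the row lives in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
    page_id        INTEGER PRIMARY KEY,
    page_namespace INTEGER NOT NULL,
    page_title     TEXT    NOT NULL,
    UNIQUE (page_namespace, page_title)
);
-- Assumed layout: one row per link, keyed by the target's namespace+title.
CREATE TABLE pagelinks (
    pl_from      INTEGER NOT NULL,  -- page ID of the linking page
    pl_namespace INTEGER NOT NULL,  -- namespace of the link target
    pl_title     TEXT    NOT NULL,  -- unprefixed title of the link target
    UNIQUE (pl_from, pl_namespace, pl_title)
);
""")

conn.execute("INSERT INTO page VALUES (1, 0, 'Main_Page')")
conn.execute("INSERT INTO page VALUES (2, 0, 'Sandbox')")
# Main_Page links to one existing page and one page that does not exist.
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, ?, ?)",
    [(1, 0, "Sandbox"), (1, 0, "No_such_page")],
)


def link_status(conn, from_id):
    """Return {(namespace, title): target_exists} for every link on a page."""
    rows = conn.execute(
        """
        SELECT pl_namespace, pl_title, page_id IS NOT NULL
        FROM pagelinks
        LEFT JOIN page
          ON page_namespace = pl_namespace AND page_title = pl_title
        WHERE pl_from = ?
        """,
        (from_id,),
    ).fetchall()
    return {(ns, title): bool(exists) for ns, title, exists in rows}


status = link_status(conn, 1)
```

Both the working link and the broken one sit in the same table; only the join result differs.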
While I've been eyeing this for a while to simplify things, doing it now
is mainly a response to the scalability problems of renaming and
deletion of widely-linked pages. These actions required updating every
linking record, hundreds of thousands in extreme cases, to maintain
consistency. That cost is, for instance, a significant factor in the
unpleasantness of dealing with page-move vandalism.
(This issue is similar to but separate from the issue of title updates
to all 'old' records for renaming often-edited pages, which was dealt
with by the page/revision split.)
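As a rough illustration of the scalability point, the sketch below deletes a page that a thousand other pages link to. The schema is again a simplified assumption with made-up column names. Because the link rows are keyed on the target's namespace+title, the delete writes a single page row and leaves the link rows alone; they simply stop resolving to an existing page.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (
    page_id INTEGER PRIMARY KEY, page_namespace INTEGER, page_title TEXT
);
CREATE TABLE pagelinks (
    pl_from INTEGER, pl_namespace INTEGER, pl_title TEXT
);
""")

conn.execute("INSERT INTO page VALUES (1, 0, 'Popular_page')")
# A thousand hypothetical pages (IDs 2..1001) all link to Popular_page.
conn.executemany(
    "INSERT INTO pagelinks VALUES (?, 0, 'Popular_page')",
    [(page_id,) for page_id in range(2, 1002)],
)


def count_link_rows(conn):
    """Total number of stored link rows, resolved or not."""
    return conn.execute("SELECT COUNT(*) FROM pagelinks").fetchone()[0]


def count_resolved_links(conn):
    """Number of link rows whose target currently exists in page."""
    return conn.execute(
        """
        SELECT COUNT(*) FROM pagelinks
        JOIN page ON page_namespace = pl_namespace AND page_title = pl_title
        """
    ).fetchone()[0]


resolved_before = count_resolved_links(conn)

# Deleting the target touches one row in page and nothing in pagelinks.
cursor = conn.execute("DELETE FROM page WHERE page_id = 1")
rows_written = cursor.rowcount

link_rows_after = count_link_rows(conn)
resolved_after = count_resolved_links(conn)
```

Recreating a page with the same namespace and title would make the same link rows resolve again, still without rewriting them.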
It may be necessary to do some shakedown testing to make sure I haven't
introduced fun new bugs, but I figured better to do it now than have to
wait until the next major release. The update.php script should convert
the existing tables automatically (it leaves the old tables in place for now).
At some point we should also introduce the ability to run page_touched
and squid purge updates in the background, by handing the target page to
a purge daemon. This won't require database changes, though.
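One possible shape for that hand-off, as a sketch only: the request enqueues the target page and returns, and a worker drains the queue later. Everything here is hypothetical. A thread stands in for the separate daemon process, and fake_purge is a placeholder for whatever would actually bump page_touched and send the squid purges.

```python
import queue
import threading

purge_queue = queue.Queue()
purged = []  # records what the placeholder handler was asked to purge


def fake_purge(target):
    # Placeholder: a real handler would update page_touched for linking
    # pages and issue the squid purge requests for them.
    purged.append(target)


def purge_worker(q, handle):
    """Drain (namespace, title) targets from q until a None sentinel arrives."""
    while True:
        target = q.get()
        try:
            if target is None:
                return
            handle(target)
        finally:
            q.task_done()


worker = threading.Thread(
    target=purge_worker, args=(purge_queue, fake_purge), daemon=True
)
worker.start()

# The foreground request only enqueues the target page and moves on.
purge_queue.put((0, "Popular_page"))

# Shut down cleanly for the purposes of this example.
purge_queue.put(None)
purge_queue.join()
worker.join()
```

Since the queue only carries the target's namespace+title key, this fits the note above that no database changes would be needed.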
-- brion vibber (brion @ pobox.com)