Rowan Collins wrote:
It occurred to me when I saw this that this means that displaying the current content of a page requires reading data from all three tables: page_current refers to a rev_id, and rev_text_id then refers to the 'old_id' in the text table. This seems inefficient somehow, since I wouldn't expect anything from the revision table is actually needed during normal page viewing (it being entirely history metadata).
Currently the last modified date is used in the page footer, which needs to be pulled from the revision record. If we really wanted, it might be possible to pull in a copy of the timestamp to the page record, but I'm not sure it would be worthwhile.
Moreover, it looks to me like displaying the page's history doesn't refer to page_latest at all, so would it be possible and appropriate to make page_latest point directly to the 'text', rather than the 'revision'?
That would make it very hard to track the current revision in general, since we wouldn't know for sure which revision ID was the current one (the revision -> text mapping is now one-to-many to allow storing revision records for metadata changes without wastefully duplicating text storage).
However often we shouldn't need to load the text at all; pre-rendered HTML will be pulled from the parser cache. Since things are still being rewritten to make best use of the new schema it's possible that right _now_ it's wastefully loading the text anyway, but it doesn't need to do so.
-- brion vibber (brion @ pobox.com)