revision and text now use separate row ID numbers in HEAD. A revision row refers to text.old_id with its rev_text_id key; this allows text revisions to be stored independently of a given revision.
* Operations that change only metadata can be put in the page history without storing a new text record. I've done this for page move as a start. (It might be good to also add a marker field for metadata-only changes so they can be shown distinctly in the history.)
* In theory, reverts could do the same, referring to the prior text record without saving a new copy.
* The storage backend can number text objects using its own scheme; if necessary, text object IDs can be reassigned during batch recompression.
If you're running a 1.5 test wiki, you'll have to run update.php to add the field (or apply maintenance/archive/patch-rev_text_id.sql manually).
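
As a rough illustration of the first point, here is a minimal sketch using an in-memory SQLite database. The table and column names (text.old_id, revision.rev_text_id) follow the message above; the cut-down schema, the helper functions save_edit and save_metadata_change, and the sample data are assumptions made for the example, not MediaWiki's actual code.

```python
import sqlite3

# Simplified, hypothetical subset of the schema: only the columns needed
# to show a revision row pointing at a text row via rev_text_id.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE "text" (old_id INTEGER PRIMARY KEY, old_text TEXT);
CREATE TABLE revision (
    rev_id      INTEGER PRIMARY KEY,
    rev_page    INTEGER,
    rev_text_id INTEGER REFERENCES "text"(old_id),
    rev_comment TEXT
);
""")


def save_edit(page_id, content, comment):
    """Hypothetical helper: store a new text row and a revision pointing at it."""
    text_id = conn.execute(
        'INSERT INTO "text" (old_text) VALUES (?)', (content,)
    ).lastrowid
    return conn.execute(
        "INSERT INTO revision (rev_page, rev_text_id, rev_comment) VALUES (?, ?, ?)",
        (page_id, text_id, comment),
    ).lastrowid


def save_metadata_change(page_id, prev_rev_id, comment):
    """Hypothetical helper: add a revision that reuses the previous text row."""
    (text_id,) = conn.execute(
        "SELECT rev_text_id FROM revision WHERE rev_id = ?", (prev_rev_id,)
    ).fetchone()
    return conn.execute(
        "INSERT INTO revision (rev_page, rev_text_id, rev_comment) VALUES (?, ?, ?)",
        (page_id, text_id, comment),
    ).lastrowid


first = save_edit(1, "Hello, world", "initial edit")
moved = save_metadata_change(1, first, "moved Foo to Bar")

text_rows = conn.execute('SELECT COUNT(*) FROM "text"').fetchone()[0]
rev_rows = conn.execute("SELECT COUNT(*) FROM revision").fetchone()[0]
print(text_rows, rev_rows)
```

The page move shows up as a second revision row, but no second text row is written.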
-- brion vibber (brion @ pobox.com)
On Mar 28, 2005 12:08 PM, Brion Vibber brion@pobox.com wrote:
revision and text now use separate row ID numbers in HEAD. A revision row refers to text.old_id with its rev_text_id key; this allows text revisions to be stored independently of a given revision.
It occurred to me when I saw this that displaying the current content of a page now requires reading data from all three tables: page_latest refers to a rev_id, and rev_text_id then refers to the 'old_id' in the text table. This seems inefficient somehow, since I wouldn't expect anything from the revision table to be needed during normal page viewing (it being entirely history metadata). Moreover, it looks to me like displaying the page's history doesn't refer to page_latest at all, so would it be possible and appropriate to make page_latest point directly to the 'text', rather than the 'revision'?
If I've missed some obvious reason why this is poppycock, I apologise for wasting your time. :)
Rowan Collins wrote:
It occurred to me when I saw this that displaying the current content of a page now requires reading data from all three tables: page_latest refers to a rev_id, and rev_text_id then refers to the 'old_id' in the text table. This seems inefficient somehow, since I wouldn't expect anything from the revision table to be needed during normal page viewing (it being entirely history metadata).
Currently the last modified date is used in the page footer, which needs to be pulled from the revision record. If we really wanted, it might be possible to pull in a copy of the timestamp to the page record, but I'm not sure it would be worthwhile.
Moreover, it looks to me like displaying the page's history doesn't refer to page_latest at all, so would it be possible and appropriate to make page_latest point directly to the 'text', rather than the 'revision'?
That would make it very hard to track the current revision in general, since we wouldn't know for sure which revision ID was the current one (the revision -> text mapping is now many-to-one, to allow storing revision records for metadata changes without wastefully duplicating text storage).
However, often we shouldn't need to load the text at all; pre-rendered HTML will be pulled from the parser cache. Since things are still being rewritten to make best use of the new schema, it's possible that right _now_ it's wastefully loading the text anyway, but it doesn't need to do so.
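
To make the lookup chain concrete, here is a small sketch, again using SQLite with a cut-down schema. The column names page_latest, rev_id, rev_text_id and old_id come from the thread; rev_timestamp, the helper functions and all the sample rows (IDs, title, timestamps) are invented for the example. It shows the page -> revision -> text join for viewing the current page, and why a text ID alone would not identify the current revision once several revisions share one text row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page (page_id INTEGER PRIMARY KEY, page_title TEXT, page_latest INTEGER);
CREATE TABLE revision (rev_id INTEGER PRIMARY KEY, rev_page INTEGER,
                       rev_text_id INTEGER, rev_timestamp TEXT);
CREATE TABLE "text" (old_id INTEGER PRIMARY KEY, old_text TEXT);

-- Made-up sample data: an edit (rev 100) followed by a page move (rev 101)
-- that reuses the same text row.
INSERT INTO "text" VALUES (10, 'Hello, world');
INSERT INTO revision VALUES (100, 1, 10, '20050401120000');
INSERT INTO revision VALUES (101, 1, 10, '20050402221100');
INSERT INTO page VALUES (1, 'Bar', 101);
""")


def current_view(title):
    """Fetch the current revision ID, its timestamp (for the footer) and the text."""
    return conn.execute(
        """
        SELECT rev_id, rev_timestamp, old_text
        FROM page
        JOIN revision ON rev_id = page_latest
        JOIN "text" ON old_id = rev_text_id
        WHERE page_title = ?
        """,
        (title,),
    ).fetchone()


def revisions_for_text(text_id):
    """List every revision that points at a given text row."""
    rows = conn.execute(
        "SELECT rev_id FROM revision WHERE rev_text_id = ? ORDER BY rev_id",
        (text_id,),
    )
    return [row[0] for row in rows]


latest = current_view("Bar")
sharing = revisions_for_text(10)
print(latest)
print(sharing)
```

If page_latest held the text ID (10 here) instead of the revision ID, both revisions 100 and 101 would match, so the current one could not be picked out from that field alone.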
-- brion vibber (brion @ pobox.com)
On Apr 2, 2005 10:11 PM, Brion Vibber brion@pobox.com wrote:
That would make it very hard to track the current revision in general, since we wouldn't know for sure which revision ID was the current one (the revision -> text mapping is now many-to-one, to allow storing revision records for metadata changes without wastefully duplicating text storage).
But is this used? For reverts done by admins? For "manual" reverts done by users? (Edited an old version, made no changes to it)
If not, are there plans to use it?
Tomer Chachamu wrote:
On Apr 2, 2005 10:11 PM, Brion Vibber brion@pobox.com wrote:
That would make it very hard to track the current revision in general, since we wouldn't know for sure which revision ID was the current one (the revision -> text mapping is now many-to-one, to allow storing revision records for metadata changes without wastefully duplicating text storage).
But is this used?
At present it's used for page moves.
For reverts done by admins? For "manual" reverts done by users? (Edited an old version, made no changes to it)
Todo.
-- brion vibber (brion @ pobox.com)
wikitech-l@lists.wikimedia.org