On Mon, 12 Sep 2005 01:56:53 -0700, Brion Vibber wrote:
Netocrat wrote:
What is/are the reason/s for storing the full text of page revisions in
the database as opposed to some form of differential?
Expedience; it hasn't been written yet.
Am I correct in assuming that speed has been given priority over storage
space requirements, and if so, has any benchmarking been done to find out
how much overhead would be added by storing revisions as diffs and how
much space would be saved?
See Tim's presentation from 21C3:
http://zwinger.wikimedia.org/berlin/
That's exactly the sort of info I was looking for. Was any attempt made
to compress the diffs? I would be interested to know how the result
compared, in compression ratio and overall speed, with the compressed
concatenated revisions.
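A rough way to explore that comparison is sketched below. It is only an illustration on made-up data: the revision generator, the function names, and the choice of zlib plus unified diffs are all assumptions, not the format MediaWiki actually uses, and real page histories may behave quite differently.

```python
import difflib
import zlib


def make_revisions(n=20):
    """Build hypothetical revisions: each one appends a line to the previous."""
    lines = [f"Paragraph {i}: some fairly stable wiki text.\n" for i in range(30)]
    revisions = []
    for r in range(n):
        lines = lines + [f"Edit {r}: a newly added sentence.\n"]
        revisions.append("".join(lines))
    return revisions


def raw_size(revisions):
    """Total bytes if every revision is stored in full, uncompressed."""
    return sum(len(r.encode("utf-8")) for r in revisions)


def concatenated_size(revisions):
    """Bytes used when all revisions are joined and compressed as one blob."""
    blob = "\x00".join(revisions).encode("utf-8")
    return len(zlib.compress(blob))


def compressed_diff_size(revisions):
    """Bytes used by a compressed first revision plus compressed diffs."""
    total = len(zlib.compress(revisions[0].encode("utf-8")))
    for old, new in zip(revisions, revisions[1:]):
        diff = "".join(
            difflib.unified_diff(
                old.splitlines(keepends=True),
                new.splitlines(keepends=True),
            )
        )
        total += len(zlib.compress(diff.encode("utf-8")))
    return total


if __name__ == "__main__":
    revs = make_revisions()
    print("raw:", raw_size(revs))
    print("concatenated + zlib:", concatenated_size(revs))
    print("zlib-compressed diffs:", compressed_diff_size(revs))
```

Timing the two storage functions (for example with `timeit`) would give a first impression of the speed side, though only measurements on real revision histories would be meaningful.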
The three main reasons given for seeking an improvement over RCS diffs were:
* moved paragraphs
* reverted edits
* minor changes within a line
The first and third could be handled by a customised diff format, and the
second by links in the database. Have those possibilities been considered,
and what are the pros and cons of this approach versus the current
compression scheme?
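To make the suggestion concrete, here is a minimal sketch of two of those ideas: a paragraph-level delta that records moved paragraphs as copies, and a store that turns reverts into links to already-stored text by hashing content. Every name here (`encode_moves`, `decode_moves`, `RevisionStore`) is hypothetical, a dictionary stands in for the database, and changes within a line are not covered; they would need a finer-grained word or character diff.

```python
import hashlib

PARA_SEP = "\n\n"  # assumption: paragraphs are separated by a blank line


def encode_moves(old, new):
    """Describe `new` as copies of paragraphs from `old` plus inserted text."""
    first_index = {}
    for i, para in enumerate(old.split(PARA_SEP)):
        first_index.setdefault(para, i)
    ops = []
    for para in new.split(PARA_SEP):
        if para in first_index:
            # Unchanged or moved paragraph: store only its old position.
            ops.append(("copy", first_index[para]))
        else:
            # New or edited paragraph: store its text in full.
            ops.append(("insert", para))
    return ops


def decode_moves(old, ops):
    """Rebuild the new revision from the old text and the operation list."""
    old_paras = old.split(PARA_SEP)
    return PARA_SEP.join(
        old_paras[arg] if kind == "copy" else arg for kind, arg in ops
    )


class RevisionStore:
    """Stores each distinct text once; a revert becomes a link to it."""

    def __init__(self):
        self.texts = {}      # text id -> full text
        self.by_hash = {}    # SHA-1 hex digest -> text id
        self.revisions = []  # revision number -> text id

    def add(self, text):
        digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
        text_id = self.by_hash.get(digest)
        if text_id is None:
            text_id = len(self.texts)
            self.texts[text_id] = text
            self.by_hash[digest] = text_id
        self.revisions.append(text_id)
        return len(self.revisions) - 1

    def get(self, rev_id):
        return self.texts[self.revisions[rev_id]]
```

The sketch trusts the hash to identify identical text; a production version would presumably compare the stored text as well before linking.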
The disadvantage of the current compression scheme seems to me to be that
the wiki software must work on the full text of a whole set of revisions
at a time (i.e. once uncompressed).
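The point can be illustrated with a small sketch, assuming a simplified blob layout (revisions joined by a NUL byte and compressed with zlib) that is not the actual storage format; `pack` and `fetch` are hypothetical helpers.

```python
import zlib

SEPARATOR = "\x00"  # assumption: NUL never appears in page text


def pack(revisions):
    """Compress a set of revisions together as a single blob."""
    return zlib.compress(SEPARATOR.join(revisions).encode("utf-8"))


def fetch(blob, index):
    """Return one revision and the number of bytes that were decompressed."""
    raw = zlib.decompress(blob)  # the whole set is expanded, not just one text
    return raw.decode("utf-8").split(SEPARATOR)[index], len(raw)
```

Even when only one revision is wanted, `fetch` expands every revision in the blob, which is the cost being described.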
Also, has there been any discussion of the possibility of branching a
page (as is possible in e.g. a CVS repository)?
Not really. Tagging of revisions is likely to happen soonish, branching
not so likely.
Being able to specify a particular revision in a link would be useful - I
presume that's why tagging is being considered.
--
http://members.dodo.com.au/~netocrat