On Mon, 12 Sep 2005 22:51:28 +0200, Tomasz Wegrzanowski wrote:
> Maybe download a dump and try with xdelta?
Looks promising. The only drawback I can see is that it stores an MD5
sum, which for very small changes can make it less space-efficient than
an ordinary diff in RCS format, and which is plain unnecessary for
MediaWiki. I'll see if there's a way to disable the MD5 sum; perhaps the
source will need to be hacked.
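To put a rough number on the small-change case, here is a quick Python
sketch. It is not a measurement of xdelta itself: it only assumes the
checksum costs one raw MD5 digest per stored delta, and it uses the bytes
of the added lines from difflib as a stand-in for the payload of an
RCS-style diff. The sample texts and the helper name are made up for
illustration.

```python
import difflib
import hashlib

# Size of a raw MD5 digest in bytes (assumed to be the per-delta overhead).
MD5_DIGEST_BYTES = hashlib.md5().digest_size


def diff_payload_size(old: str, new: str) -> int:
    """Bytes of added lines in a line diff; a rough proxy for an RCS-style delta."""
    old_lines = old.splitlines(keepends=True)
    new_lines = new.splitlines(keepends=True)
    size = 0
    for line in difflib.ndiff(old_lines, new_lines):
        if line.startswith("+ "):
            # Strip the two-character ndiff prefix before counting bytes.
            size += len(line[2:].encode("utf-8"))
    return size


# Hypothetical one-word edit between two revisions.
old = "The quick brown fox\njumps over the lazy dog\n"
new = "The quick brown fox\njumps over the lazy cat\n"

payload = diff_payload_size(old, new)
overhead_ratio = MD5_DIGEST_BYTES / payload

print(f"diff payload: {payload} bytes, checksum: {MD5_DIGEST_BYTES} bytes")
print(f"checksum is {overhead_ratio:.0%} of the payload")
```

For a one-word edit like this the checksum is of the same order as the
change itself, which is the case I'm worried about; for large edits it
disappears into the noise.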
Now I have a David and Goliath problem: 56k dial-up vs. a 31G XML download.
Can anyone suggest a source for a smaller data set in English with some
representative multiple-revision articles, preferably including a few edit
wars?
--
http://members.dodo.com.au/~netocrat