Brion Vibber wrote:
While it's not any official or de-facto standard that we know of, the
code is open source (LGPL, CPL) and a basic command-line archiver is
available for most Unix-like platforms as well as Windows so it should
be free to use (in the absence of surprise patents):
I've had good experiences with 7-zip under Windows; didn't know there
was a *nix tool, which kept me from using it more often.
I'm probably going to try to work LZMA compression into the dump process
to supplement the gzipped files; and/or we could switch from gzip back
to bzip2, which provides a still respectable improvement in compression
and is a bit more standard.
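
For anyone who wants to compare the three on their own data, here is a rough sketch using Python's standard gzip, bz2 and lzma modules. The sample text is made up, so the ratios say nothing about real dumps. Note also that Python's lzma module writes the .xz container by default, not a 7-zip .7z archive, so treat the LZMA number as a ballpark only.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes for each codec, default settings."""
    return {
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
        # Default container is .xz, not .7z; good enough for a rough comparison.
        "lzma": len(lzma.compress(data)),
    }


# Made-up stand-in for a chunk of an XML dump; swap in real dump bytes to test.
sample = b"<revision><text>Some wiki text here.</text></revision>\n" * 5000

for name, size in compressed_sizes(sample).items():
    print(f"{name}: {size} bytes ({size / len(sample):.2%} of original)")
```

Reading a slice of an actual dump file into `sample` would give a much more meaningful comparison than the repeated line above.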
Which reminds me: Why do people have to download the whole "XML'd"
database every time they want to update? There should be a way to make
smaller packages (like "pre-2003" or "2005-01") for all revisions
created in that timeframe. These could then be patched together and
updated with the latest package for those who need the old revisions at
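
To make the packaging idea a bit more concrete, here is a minimal sketch of how revisions could be sorted into such packages. It assumes timestamps are 14-digit YYYYMMDDHHMMSS strings; the 2003 cutoff and the function names are just placeholders of mine, not anything that exists in the dump code.

```python
from collections import defaultdict

CUTOFF_YEAR = 2003  # hypothetical: everything older lands in a single package


def package_name(timestamp: str) -> str:
    """Map a YYYYMMDDHHMMSS timestamp to a package label."""
    year, month = int(timestamp[:4]), timestamp[4:6]
    if year < CUTOFF_YEAR:
        return f"pre-{CUTOFF_YEAR}"
    return f"{year}-{month}"


def group_by_package(revisions):
    """Group (rev_id, timestamp) pairs into {package label: [rev_id, ...]}."""
    packages = defaultdict(list)
    for rev_id, timestamp in revisions:
        packages[package_name(timestamp)].append(rev_id)
    return dict(packages)


revisions = [(1, "20021231235959"), (2, "20050103101500"), (3, "20050120080000")]
print(group_by_package(revisions))
```

Once a month is over its package never changes again, so someone updating would only need to fetch the packages they don't have yet.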
Taking this idea further, the database seems to hold up fine right now,
but with exponential growth come lots'o'revisions. At some point, we
might want to add a "rev_on_disk" field to the revisions table, and move
the text of revisions older than, say, 3 months to the file system (file
name generated through article and revision ID). That would save lots of
space in the database, and not interfere with important ongoing
operations like revert wars :-) and still keep the "really old" versions
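
A rough sketch of what I have in mind, in Python for brevity. The directory layout, the 90-day threshold and all function names are assumptions for illustration; only the "rev_on_disk" idea and the "file name from article and revision ID" part come from the text above.

```python
from datetime import datetime, timedelta
from pathlib import Path

MAX_AGE = timedelta(days=90)  # "say, 3 months"; the exact threshold is a guess


def revision_path(root, article_id: int, rev_id: int) -> Path:
    """Build the file name from article and revision ID (hypothetical layout)."""
    return Path(root) / str(article_id) / f"{rev_id}.txt"


def should_move_to_disk(rev_time: datetime, now: datetime) -> bool:
    """True if the revision is older than the threshold."""
    return now - rev_time > MAX_AGE


def archive_revision(root, article_id: int, rev_id: int, text: str) -> Path:
    """Write the revision text to disk and return where it went.

    The caller would then set rev_on_disk on the row and drop the text
    from the database; that part is left out here.
    """
    path = revision_path(root, article_id, rev_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(text, encoding="utf-8")
    return path
```

One directory per article would get unwieldy with a very large number of articles, so a real layout would probably want to spread them over subdirectories somehow.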
Not making much sense, am I? I need more coffee...