Brion Vibber wrote:
While it's not any official or de-facto standard that we know of, the code is open source (LGPL, CPL) and a basic command-line archiver is available for most Unix-like platforms as well as Windows so it should be free to use (in the absence of surprise patents): http://www.7-zip.org/sdk.html
I've had good experiences with 7-zip under Windows; didn't know there was a *nix tool, which kept me from using it more often.
I'm probably going to try to work LZMA compression into the dump process to supplement the gzipped files; and/or we could switch from gzip back to bzip2, which provides a still respectable improvement in compression and is a bit more standard.
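For a rough feel of the gzip / bzip2 / LZMA trade-off, something like the sketch below could be run against a chunk of a real dump. It assumes Python's standard gzip, bz2 and lzma modules as stand-ins for the command-line tools (the lzma module writes the .xz container, not 7-Zip's .7z), and the sample data is a made-up placeholder, so the numbers it prints say nothing about real dumps.

```python
import bz2
import gzip
import lzma


def compressed_sizes(data: bytes) -> dict:
    """Return the compressed size in bytes for each codec at its default settings."""
    return {
        "gzip": len(gzip.compress(data)),
        "bzip2": len(bz2.compress(data)),
        "lzma": len(lzma.compress(data)),
    }


# Hypothetical stand-in for a dump fragment; replace with bytes read from a real file.
sample = b"<revision><id>1</id><text>Some article text</text></revision>\n" * 1000

sizes = compressed_sizes(sample)
for name, size in sizes.items():
    print(f"{name}: {size} bytes ({size / len(sample):.2%} of original)")
```

Compression and decompression time matter as much as size for a dump process, so a fuller comparison would time each call as well.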
Which reminds me: Why do people have to download the whole "XML'd" database every time they want to update? There should be a way to make smaller packages (like "pre-2003" or "2005-01") for all revisions created in that timeframe. These could then be patched together and updated with the latest package for those who need the old revisions at home :-)
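A minimal sketch of the packaging idea, assuming each revision carries an ISO 8601 timestamp; the function names, the cutoff year and the sample revisions are all hypothetical, not anything the dump process has today:

```python
from collections import defaultdict
from datetime import datetime


def package_name(timestamp: str, cutoff_year: int = 2003) -> str:
    """Map a timestamp to a package label such as "pre-2003" or "2005-01"."""
    ts = datetime.strptime(timestamp, "%Y-%m-%dT%H:%M:%SZ")
    if ts.year < cutoff_year:
        return f"pre-{cutoff_year}"
    return f"{ts.year:04d}-{ts.month:02d}"


def group_revisions(revisions):
    """Group (rev_id, timestamp) pairs by the package they would be shipped in."""
    packages = defaultdict(list)
    for rev_id, timestamp in revisions:
        packages[package_name(timestamp)].append(rev_id)
    return dict(packages)


# Made-up revisions, for illustration only.
revisions = [
    (1, "2002-11-03T10:00:00Z"),
    (2, "2005-01-15T08:30:00Z"),
    (3, "2005-01-20T12:00:00Z"),
    (4, "2005-02-01T00:00:00Z"),
]
packages = group_revisions(revisions)
print(packages)
```

Closed packages never change once their month is over, so someone who already has "pre-2003" and "2005-01" would only need to fetch the newest one.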
Taking this idea further: the database seems to hold up fine right now, but with exponential growth come lots'o'revisions. At some point, we might want to add a "rev_on_disk" field to the revisions table, and move the text of revisions older than, say, 3 months to the file system (file name generated from the article and revision IDs). That would save lots of space in the database without interfering with important ongoing operations like revert wars :-) while still keeping the "really old" versions accessible.
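To make the "rev_on_disk" idea concrete, here is a rough sketch using an in-memory SQLite database and a temporary directory. The table layout, the column names other than rev_on_disk, the 90-day cutoff and the directory-per-article file naming are all assumptions for illustration; the real schema and storage layout would differ.

```python
import sqlite3
import tempfile
from datetime import datetime, timedelta
from pathlib import Path

MAX_AGE = timedelta(days=90)  # "say, 3 months"; an assumed cutoff


def revision_path(root: Path, page_id: int, rev_id: int) -> Path:
    """Build the file name from the article ID and the revision ID."""
    return root / str(page_id) / f"{rev_id}.txt"


def archive_old_revisions(db, root: Path, now: datetime) -> int:
    """Move the text of revisions older than MAX_AGE to disk; return how many moved."""
    cutoff = (now - MAX_AGE).isoformat()
    rows = db.execute(
        "SELECT rev_id, page_id, rev_text FROM revisions "
        "WHERE rev_on_disk = 0 AND rev_timestamp < ?",
        (cutoff,),
    ).fetchall()
    for rev_id, page_id, text in rows:
        path = revision_path(root, page_id, rev_id)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(text, encoding="utf-8")
        # Drop the text from the database and flag the row as stored on disk.
        db.execute(
            "UPDATE revisions SET rev_text = NULL, rev_on_disk = 1 WHERE rev_id = ?",
            (rev_id,),
        )
    db.commit()
    return len(rows)


def load_text(db, root: Path, rev_id: int) -> str:
    """Fetch revision text from the database or from disk, depending on the flag."""
    page_id, text, on_disk = db.execute(
        "SELECT page_id, rev_text, rev_on_disk FROM revisions WHERE rev_id = ?",
        (rev_id,),
    ).fetchone()
    if on_disk:
        return revision_path(root, page_id, rev_id).read_text(encoding="utf-8")
    return text


# Hypothetical table and data, for illustration only.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE revisions (rev_id INTEGER PRIMARY KEY, page_id INTEGER, "
    "rev_timestamp TEXT, rev_text TEXT, rev_on_disk INTEGER DEFAULT 0)"
)
now = datetime(2005, 6, 1)
db.executemany(
    "INSERT INTO revisions (rev_id, page_id, rev_timestamp, rev_text) VALUES (?, ?, ?, ?)",
    [
        (1, 10, datetime(2004, 1, 1).isoformat(), "old text"),
        (2, 10, datetime(2005, 5, 20).isoformat(), "recent text"),
    ],
)
root = Path(tempfile.mkdtemp())
moved = archive_old_revisions(db, root, now)
print(moved, load_text(db, root, 1), load_text(db, root, 2))
```

Recent revisions stay in the database, so diffs and reverts on active articles never touch the file system; only reads of old history pay the extra file access.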
Not making much sense, am I? I need more coffee...
Magnus