Did anybody else keep a backup of the English Wikipedia database from
when it ran on UseMod (Jan 2001 - Jan 2002)?
I seem to recall keeping a copy, but can't find it now. It's primarily
just for historical interest, but there is also one practical issue
that some people may care about:
When the UseMod database was originally converted to Magnus's new wiki
software, only the latest revision text of each page was copied into
the new database, and it was attributed to 'Conversion script'. Some
months later I wrote a script that did another pass through the data,
correcting for some renaming that had been done in the meantime, to
copy in the previous page histories with attribution. But the actual
current revision at the time of conversion was not changed, and was
left attributed to 'Conversion script'.
If we ever hope to fix those up, we'll need the pre-conversion database.
It should be in a tarball, something on the order of 50-150 megabytes,
containing 'lib-http' and 'work-http' subdirectories. Somewhere under
'lib-http' is a 'page' subdirectory with lots and lots of files split
up by letter -- that's the guts of it. The tarball was probably, but not
necessarily, named something like 'wikipedia-usemod.tgz' or
'wiki-fromusemod.tgz'.
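If you have a pile of old tarballs and don't want to unpack each one by
hand, a rough check for the layout described above might look like the
following. This is only a sketch: the function name is made up for
illustration, and it assumes nothing about the archive beyond the
'lib-http', 'work-http' and 'page' directory names mentioned here. The
exact depth of 'page' under 'lib-http' isn't known, so it accepts any
depth.

```python
import tarfile
from pathlib import PurePosixPath


def looks_like_usemod_backup(tar_path):
    """Return True if the archive matches the layout described in this post.

    Hypothetical helper: looks for a 'work-http' directory anywhere in
    the archive, plus a 'page' directory somewhere under 'lib-http'.
    """
    has_work = False
    has_page_under_lib = False
    # "r:*" lets tarfile auto-detect the compression (gzip, bzip2, none).
    with tarfile.open(tar_path, "r:*") as tar:
        for name in tar.getnames():
            parts = PurePosixPath(name).parts
            if "work-http" in parts:
                has_work = True
            if "lib-http" in parts:
                lib_at = parts.index("lib-http")
                # 'page' may sit at any depth below 'lib-http'.
                if "page" in parts[lib_at + 1:]:
                    has_page_under_lib = True
            if has_work and has_page_under_lib:
                return True
    return False
```

Calling looks_like_usemod_backup('wikipedia-usemod.tgz') on a candidate
file returns True only when both pieces are present. A match is just a
hint that the tarball is worth a closer look, not proof that it is the
pre-conversion English database.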
We don't need it for other languages (they were converted with a later
script that included histories intact, and I still have a bunch of
those tarballs anyway), just for English.
-- brion vibber (brion @ pobox.com)