On 2/15/06, Brion Vibber <brion(a)pobox.com> wrote:
1) Check md5sum.
I did indeed have a bad bz2 file.
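
For anyone who wants to script the same check rather than run md5sum by hand, here is a minimal sketch in Python. The function name md5_of_file and the chunk size are my own choices for illustration, not anything from the dump tools; it reads the file in chunks so a multi-gigabyte dump is never held in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in `chunk_size`-byte pieces so that very large
    dumps can be hashed without loading them into memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The result would then be compared by eye (or with ==) against the checksum published alongside the download.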
3) Try the .7z version
When I used 7z e -so, it added some garbage to the output before the
<media..., and there doesn't appear to be a way to silence it (crazy):
en$ less 7z-junk
7-Zip (A) 4.33 beta Copyright (c) 1999-2006 Igor Pavlov 2006-02-05
p7zip Version 4.33 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on)
Processing archive: enwiki-20060125-pages-meta-history.xml.7z
Extracting enwiki-20060125-pages-meta-history.xml<mediawiki xm
If you've successfully extracted 7z to stdout without that garbage,
I'd love to know the invocation.
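
In the meantime, one possible workaround is to filter the stream and throw away everything before the root element. This is only a sketch, assuming the junk always comes before the first "<mediawiki" and never contains that string itself; strip_leading_junk and MARKER are names I made up for illustration.

```python
MARKER = b"<mediawiki"


def strip_leading_junk(src, dst, marker=MARKER, chunk_size=1 << 16):
    """Copy binary stream `src` to `dst`, dropping all bytes before `marker`.

    Returns True if the marker was found, False if the input ended
    without it (in which case nothing is written to `dst`).
    """
    keep = len(marker) - 1  # tail length that could hold a split marker
    pending = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return False
        pending += chunk
        pos = pending.find(marker)
        if pos != -1:
            # Emit from the marker onward, then fall through to plain copying.
            dst.write(pending[pos:])
            break
        # Keep only the bytes that might be the start of a marker
        # straddling the boundary with the next chunk.
        pending = pending[-keep:] if keep else b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return True
        dst.write(chunk)
```

In a small script this would be called as strip_leading_junk(sys.stdin.buffer, sys.stdout.buffer), with the script placed in the pipe right after 7z e -so.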
4) Tell me your exact OS and Java VM version.
Ubuntu Hoary, x86, 32-bit:
$ uname -a
Linux ... i686 GNU/Linux
en$ java -version
java version "1.5.0_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_05-b05)
Java HotSpot(TM) Client VM (build 1.5.0_05-b05, mixed mode, sharing)
But I'm re-downloading the bz2, and will report back.
(I've previously checked the full uncompressed copy of
that dump for schema validity; it's fine.)
Thanks for your assistance.