On 2/15/06, Brion Vibber <brion(a)pobox.com> wrote:
1) Check md5sum.
I did indeed have a bad bz2 file.
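For anyone repeating the check without md5sum at hand, here is a minimal sketch of the same comparison in Python. The function names are made up for illustration, and the published checksum is assumed to be the hex string listed alongside the dump.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of the file at `path`.

    The file is read in chunks so a multi-gigabyte dump never has to
    fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_hex):
    """Compare the file's digest with a published hex checksum.

    Surrounding whitespace and letter case in `published_hex` are ignored.
    """
    return md5_of_file(path) == published_hex.strip().lower()
```

A mismatch means the download is damaged and should be fetched again before trying anything else.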
3) Try the .7z version
When I used 7z e -so, it wrote some garbage to the output ahead of the
<media..., and there doesn't appear to be a way to silence it (crazy,
I know).
Example:
"
en$ less 7z-junk
7-Zip (A) 4.33 beta Copyright (c) 1999-2006 Igor Pavlov 2006-02-05
p7zip Version 4.33 (locale=en_US.UTF-8,Utf16=on,HugeFiles=on)
Processing archive: enwiki-20060125-pages-meta-history.xml.7z
Extracting enwiki-20060125-pages-meta-history.xml<mediawiki xm
"
If you've successfully extracted 7z to stdout without that garbage,
I'd love to know the invocation.
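Failing a clean invocation, one possible workaround is a small filter between 7z and the importer that drops everything before the first "<mediawiki". The sketch below is Python and rests on assumptions: strip_preamble is a made-up name, the marker is taken from the excerpt above, and the junk is assumed to appear only in front of the XML. It does nothing about anything 7z might append after the XML.

```python
def strip_preamble(src, dst, marker=b"<mediawiki", chunk_size=1 << 16):
    """Copy the binary stream `src` to `dst`, dropping all bytes that
    precede the first occurrence of `marker`.

    Returns True if the marker was found, False otherwise (in which
    case nothing is written to `dst`).
    """
    tail = b""
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return False  # stream ended without ever seeing the marker
        buf = tail + chunk
        pos = buf.find(marker)
        if pos != -1:
            dst.write(buf[pos:])
            break
        # Keep the last len(marker) - 1 bytes so a marker that straddles
        # two reads is still found on the next pass.
        tail = buf[-(len(marker) - 1):] if len(marker) > 1 else b""

    # Marker found: pass the remainder of the stream through untouched.
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return True
        dst.write(chunk)
```

Saved as a script that ends with a call to strip_preamble(sys.stdin.buffer, sys.stdout.buffer) (plus import sys), it could sit in the pipe right after 7z e -so.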
4) Tell me your exact OS and Java VM version.
Ubuntu Hoary, 32-bit x86:
$ uname -a
Linux ... i686 GNU/Linux
en$ java -version
java version "1.5.0_05"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.5.0_05-b05)
Java HotSpot(TM) Client VM (build 1.5.0_05-b05, mixed mode, sharing)
But I'm re-downloading the bz2, and will report back.
(I've previously checked the full uncompressed copy of that dump for
schema validity; it's fine.)
Thanks for your assistance.
Jeremy