I've been downloading this file (using wget on Ubuntu or fetch on FreeBSD) with no issues for years. The current one is 6.2 GB, as it should be.
On Thu, Dec 16, 2010 at 5:53 PM, emijrp emijrp@gmail.com wrote:
If the md5s don't match, the files are obviously different; that is, one of them is corrupt.
What is the size of your local file? I usually download dumps with the UNIX wget command and I don't get errors. If you are using FAT32, the file size is limited to 4 GB and the file would be truncated. Is that your case?
2010/12/16 Gabriel Weinberg yegg@alum.mit.edu
md5sum doesn't match. I get e74170eaaedc65e02249e1a54b1087cb (as opposed to 7a4805475bba1599933b3acd5150bd4d on http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-md5sums.txt).
I've downloaded it twice now and have gotten the same md5sum. Can anyone else confirm?
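For anyone who wants to repeat the comparison without the md5sum tool, here is a minimal Python sketch. The helper names `md5_of_file` and `matches_published_md5` are made up for illustration; the file is read in chunks so that a multi-gigabyte dump does not have to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read chunk by chunk."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        # iter() with a sentinel keeps calling f.read() until it returns b"".
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_md5(path, expected_hex):
    """Compare the local digest with a published one (hypothetical helper)."""
    return md5_of_file(path) == expected_hex.strip().lower()
```

Calling `matches_published_md5("enwiki-20101011-pages-articles.xml.bz2", "7a4805475bba1599933b3acd5150bd4d")` would return False for a download like the one described above, assuming the published checksum is the correct one.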
On Thu, Dec 16, 2010 at 5:41 PM, emijrp emijrp@gmail.com wrote:
Have you checked the md5sum?
2010/12/16 Gabriel Weinberg yegg@alum.mit.edu
Ariel T. Glenn <ariel <at> wikimedia.org> writes:
We now have a copy of the dumps on a backup host. Although we are still resolving hardware issues on the XML dumps server, we think it is safe enough to serve the existing dumps read-only. DNS was updated to that effect already; people should see the dumps within the hour.
Ariel
Hi, thank you for working so hard on this issue. I'm still having trouble with the latest en.wikipedia dump, however. I downloaded http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-pages-articles.xml.bz2 and am running into trouble decompressing it.
In particular, bzip2 -d enwiki-20101011-pages-articles.xml.bz2 fails.
And bzip2 -tvv enwiki-20101011-pages-articles.xml.bz2 reports:
[2752: huff+mtf data integrity (CRC) error in data
I ran bzip2recover & then bzip2 -t rec* and got the following:
bzip2: rec02752enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
bzip2: rec08881enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
bzip2: rec26198enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
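A similar integrity test can be sketched in Python with the standard bz2 module. This is an illustration, not a replacement for bzip2 -t: the function name `bz2_is_intact` is hypothetical, and unlike bzip2 -tvv it does not report which block failed, only how many bytes decompressed cleanly before the error.

```python
import bz2


def bz2_is_intact(path, chunk_size=1024 * 1024):
    """Decompress the file at `path` and discard the output.

    Returns (ok, total), where total is the number of bytes that
    decompressed before any error was hit.
    """
    total = 0
    with open(path, "rb") as raw:  # a missing file still raises as usual
        try:
            with bz2.BZ2File(raw) as f:
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
        except (OSError, EOFError):
            # OSError: invalid or corrupt data; EOFError: truncated stream.
            return False, total
    return True, total
```

A truncated download and a download with damaged blocks both come back as not intact; comparing the byte count with the file size on the server can help tell the two cases apart.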
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l