On 16/01/16 02:44, Platonides wrote:
> On 16/01/16 02:30, Richard Farmbrough wrote:
>> I have problems bunzip2ing pages-articles files. WinRAR fails at 37G, and bunzip2 fails at some point >> 14g though it "helpfully" cleans up after itself.
>> Bunzip2 v 1.0.6
>> bunzip2 enwiki-20151201-pages-articles.xml.bz2
>> bunzip2: I/O or other error, bailing out. Possible reason follows.
>> bunzip2: Permission denied
>> Input file = enwiki-20151201-pages-articles.xml.bz2, output file = enwiki-20151201-pages-articles.xml
>> bunzip2: Deleting output file enwiki-20151201-pages-articles.xml, if it exists.
>> Any better tool?
> (...) I'll check if that file uncompresses for me.
I downloaded the file enwiki-20160113-pages-articles.xml.bz2 (see hash below), and it decompressed without errors:
$ time sha256sum enwiki-20160113-pages-articles.xml.bz2
560537c3c41397856c7108287de2a8f917ad8ee2586d1d5e43a0edd4c5bc28d5 enwiki-20160113-pages-articles.xml.bz2
$ time bzip2 -kd enwiki-20160113-pages-articles.xml.bz2
real 26m48.517s
user 23m5.264s
sys 0m33.960s
$ tail enwiki-20160113-pages-articles.xml;echo
[[:Category:Articles created via the Article Wizard]]
http://www.npg.org.uk/collections/search/use-this-image.php?mkey=mw87648
http://www.npg.org.uk/collections/search/use-this-image.php?email=Jonathanai...</text>
<sha1>ejicf3kiesjaya50u6fwwziz5incohv</sha1>
</revision>
</page>
</mediawiki>
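If the failure is related to writing the output file, one way to check the download itself is to decompress it as a stream and discard the output. A minimal sketch using Python's standard bz2 module; the function name decompressed_size is made up for this example, and it has not been timed against the full dump:

```python
import bz2


def decompressed_size(path, block_size=1024 * 1024):
    """Stream-decompress a .bz2 file and return the number of output bytes.

    Nothing is written to disk. A corrupt or truncated archive makes the
    bz2 module raise OSError or EOFError while reading.
    """
    total = 0
    with bz2.open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            total += len(block)
    return total


if __name__ == "__main__":
    import sys

    # Usage: python check_bz2.py enwiki-20160113-pages-articles.xml.bz2
    print(decompressed_size(sys.argv[1]))
```

If this runs to the end, the compressed file is readable and any remaining problem is on the output side (disk, filesystem limits, permissions).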
I can provide partial hashes, or an rdiff(1) patch for whatever file you ended up with, so you don't have to download the full 12G again.
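To illustrate what I mean by partial hashes: hash the file in fixed-size chunks, so two copies can be compared chunk by chunk and the first damaged region located. A rough Python sketch; the names partial_hashes and first_mismatch and the 256 MiB chunk size are assumptions for this example, not an agreed format:

```python
import hashlib


def partial_hashes(path, chunk_size=256 * 1024 * 1024, block_size=1024 * 1024):
    """Yield (offset, sha256 hex digest) for each chunk_size slice of the file."""
    with open(path, "rb") as f:
        offset = 0
        while True:
            h = hashlib.sha256()
            remaining = chunk_size
            # Read the chunk in small blocks so memory use stays low.
            while remaining > 0:
                block = f.read(min(block_size, remaining))
                if not block:
                    break
                h.update(block)
                remaining -= len(block)
            consumed = chunk_size - remaining
            if consumed == 0:
                break  # end of file, nothing left to hash
            yield offset, h.hexdigest()
            offset += consumed
            if consumed < chunk_size:
                break  # short final chunk


def first_mismatch(mine, reference):
    """Return the offset of the first differing chunk, or None if all match.

    Both arguments are lists of (offset, digest) pairs made with the same
    chunk size.
    """
    for (offset, a), (_, b) in zip(mine, reference):
        if a != b:
            return offset
    if len(mine) != len(reference):
        # One file has extra chunks: report where the longer one continues.
        longer = mine if len(mine) > len(reference) else reference
        return longer[min(len(mine), len(reference))][0]
    return None
```

Only the chunks from the first mismatch onwards would then need to be fetched again or patched.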
Best regards