Jeremy Dunck wrote:
I downloaded en_20050713_pages_full.xml.gz but upon decompressing, apparently ran out of disk space.
gunzip -l doesn't appear to show the correct uncompressed size.
jgd011100@utd51523:~/wps/data$ gunzip -l en_20050713_pages_full.xml.gz
 compressed  uncompressed  ratio   uncompressed_name
32171234059    3560916631  -803.5% en_20050713_pages_full.xml
Can someone give me a ballpark size for this file (uncompressed) so I know how to deal with it?
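A note on the listing above: the gzip format stores the uncompressed length in a 32-bit trailer field (ISIZE, the input size modulo 2^32), so for a file this large the figure reported by gunzip -l can wrap around. Below is a minimal Python sketch of two ways to check, assuming a single-member .gz file. The function names are illustrative and do not come from this thread. The first reads the trailer field directly; the second streams the data and counts bytes without writing anything to disk.

```python
import gzip
import struct


def trailer_isize(path):
    """Return the ISIZE trailer field: uncompressed length modulo 2**32."""
    with open(path, "rb") as f:
        f.seek(-4, 2)  # position at the last 4 bytes of the file
        (isize,) = struct.unpack("<I", f.read(4))  # little-endian uint32
    return isize


def streamed_size(path, chunk_size=1 << 20):
    """Count decompressed bytes by streaming; nothing is written to disk."""
    total = 0
    with gzip.open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total
```

Streaming a dump of this size takes a while, but it needs no free disk space and gives the true size rather than the wrapped one.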
I'd argue that you have a version of gzip that does not cope well with such a large file.
Try installing a newer version of gzip.
AnyFile
On 10/8/05, Any File anyfile@mail.com wrote:
I'd argue that you have a version of gzip that does not cope well with such a large file.
Try installing a newer version of gzip.
It seems to be the most recent version, unless I've missed something.
jgd011100@utd51523:~/wps/data$ gunzip -V
gunzip 1.3.5 (2002-09-30) ....
Regardless, that's not my question. :)
Can someone give me a ballpark size for this file (uncompressed) so I know how to deal with it?
On 10/8/05, Jeremy Dunck jdunck@gmail.com wrote:
Can someone give me a ballpark size for this file (uncompressed) so I know how to deal with it?
In http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/19674 , Brion found 4.74x compression for similar data with gzip.
32171234059 bytes * 4.74 = roughly 152.49 GB (decimal gigabytes)
Hope this helps!
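The same estimate as a short Python sketch. The 4.74 ratio is the figure quoted for similar data, not a measurement of this file, so the result is only a ballpark. The variable and function names are illustrative.

```python
COMPRESSED_BYTES = 32171234059  # compressed size from the gunzip -l output
ASSUMED_RATIO = 4.74            # ratio reported for similar data (assumption)


def estimate_uncompressed(compressed_bytes, ratio):
    """Estimate the uncompressed size in bytes from an assumed ratio."""
    return compressed_bytes * ratio


estimate = estimate_uncompressed(COMPRESSED_BYTES, ASSUMED_RATIO)
estimate_gb = estimate / 1e9      # decimal gigabytes
estimate_gib = estimate / 2**30   # binary gibibytes, as many tools report

print(f"{estimate_gb:.2f} GB, {estimate_gib:.2f} GiB")
```

The binary figure is the one to compare against tools that report free disk space in GiB.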
On 10/8/05, Evan Martin evanm@google.com wrote:
In http://article.gmane.org/gmane.science.linguistics.wikipedia.technical/19674 , Brion found 4.74x compression for similar data with gzip.
32171234059 bytes * 4.74 = roughly 152.49 GB (decimal gigabytes)
Yes, thanks very much.
wikitech-l@lists.wikimedia.org