[Xmldatadumps-l] [Xmldatadumps-admin-l] [Wikitech-l] 2010-03-11 01:10:08: enwiki Checksumming pages-meta-history.xml.bz2 :D
Brian J Mingus
Brian.Mingus at Colorado.EDU
Thu Mar 11 05:48:35 UTC 2010
On Wed, Mar 10, 2010 at 10:43 PM, Tomasz Finc <tfinc at wikimedia.org> wrote:
> Brian J Mingus wrote:
>
>>
>> On Wed, Mar 10, 2010 at 8:54 PM, Tomasz Finc <tfinc at wikimedia.org> wrote:
>>
>> Yup, that's the one. If you have a fast upload pipe then I'm more than
>> happy to set up space for it. Otherwise it should arrive in our
>> snail mail within a couple of days.
>>
>> -tomasz
>>
>>
>> Anyone may download the file from me here:
>>
>> http://grey.colorado.edu/enwiki-20080103-pages-meta-history.xml.7z
>>
>> The md5sum is:
>>
>> 20a201afc05a4e5f2f6c3b9b7afa225c
>> enwiki-20080103-pages-meta-history.xml.7z
>>
>> The file size is:
>>
>> 18522193111 bytes (~18.5 GB)
>>
>> I'm sure you will find my pipe fat enough. ;-)
>>
>>
>
> That seems way too tiny to be the real thing.
>
> --tomasz
>
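For anyone mirroring the file above, here is a minimal Python sketch of
checking the download against the posted md5sum, equivalent to running
md5sum at the shell; reading in 1 MiB chunks keeps memory flat even on an
~18 GB file:

import hashlib

EXPECTED = "20a201afc05a4e5f2f6c3b9b7afa225c"
PATH = "enwiki-20080103-pages-meta-history.xml.7z"

md5 = hashlib.md5()
with open(PATH, "rb") as f:
    for chunk in iter(lambda: f.read(1 << 20), b""):
        md5.update(chunk)

print("OK" if md5.hexdigest() == EXPECTED else "CHECKSUM MISMATCH")
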
7-Zip has a very impressive compression ratio. From download.wikimedia.org:

- These dumps can be *very* large, uncompressing up to 100 times the
archive download size. Suitable for archival and statistical use, most
mirror sites won't want or need this.

That notice has not changed since I downloaded this file, so at up to
100:1 the uncompressed size could be well over a terabyte: 18,522,193,111
bytes x 100 is roughly 1.85 TB (see the sketch below). I'm not sure how
long it will take to unpack, but I have just started it. I wonder what
drives your intuition?
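
Rather than waiting for the unpack to finish, the archive's own metadata
can give the exact figure. A sketch, assuming the py7zr package
(pip install py7zr) and its list() metadata API; both are assumptions on
my part, not part of the dump tooling itself:

import py7zr

ARCHIVE = "enwiki-20080103-pages-meta-history.xml.7z"

# Back-of-the-envelope from the quoted notice: up to ~100x expansion.
compressed = 18522193111  # bytes, as posted above
print(f"worst case ~{compressed * 100 / 1e12:.2f} TB uncompressed")

# Exact answer from the archive metadata, no extraction needed
# (assumes py7zr's FileInfo entries expose an `uncompressed` byte count).
with py7zr.SevenZipFile(ARCHIVE) as z:
    total = sum(f.uncompressed for f in z.list())
print(f"metadata says {total / 1e12:.2f} TB uncompressed")

Either number should settle whether an 18 GB archive is plausible for the
full history.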