Got an md5sum?
On Mon, Mar 29, 2010 at 5:46 PM, Tomasz Finc tfinc@wikimedia.org wrote:
I love lzma compression.
enwiki-20100130-pages-meta-history.xml.bz2 280.3 GB
enwiki-20100130-pages-meta-history.xml.7z 31.9 GB
Download at http://tinyurl.com/yeelbse
Enjoy!
--tomasz
Tomasz Finc wrote:
Tomasz Finc wrote:
New full-history enwiki snapshot is hot off the presses!
It's currently being checksummed, which will take a while for 280GB+ of compressed data, but for those brave souls willing to test, please grab it from
http://download.wikipedia.org/enwiki/20100130/enwiki-20100130-pages-meta-his...
and give us feedback about its quality. This run took just over a month and gained a huge speedup after Tim's work on re-compressing ES. If we see no hiccups with this data snapshot, I'll start mirroring it to other locations (Internet Archive, Amazon Public Data Sets, etc.).
For those not familiar, the last successful run that we've seen of this data goes all the way back to 2008-10-03. That's over 1.5 years of people waiting to get access to these data bits.
I'm excited to say that we seem to have it :)
--tomasz
We now have an md5sum for enwiki-20100130-pages-meta-history.xml.bz2.
"65677bc275442c7579857cc26b355ded"
Please verify against it before filing issues.
--tomasz
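For anyone who wants to check a download against the md5sum quoted above, here is a minimal sketch in Python. It assumes the .bz2 file is already on local disk. The function names and the 1 MiB chunk size are illustrative choices, not anything the dump tooling prescribes. The file is read in chunks so that a 280GB+ archive never has to fit in memory.

```python
import hashlib

# md5sum posted in this thread for enwiki-20100130-pages-meta-history.xml.bz2
EXPECTED_MD5 = "65677bc275442c7579857cc26b355ded"


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        # iter() with a sentinel keeps calling fh.read() until it returns b"" (EOF).
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_expected(path, expected=EXPECTED_MD5):
    """True if the file's MD5 equals `expected` (case and whitespace insensitive)."""
    return md5_of_file(path) == expected.strip().lower()
```

Calling matches_expected("enwiki-20100130-pages-meta-history.xml.bz2") from the download directory should return True for an intact copy. On a file this size, expect the pass to take a long time, since every byte has to be read from disk.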
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
Xmldatadumps-admin-l mailing list Xmldatadumps-admin-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-admin-l