----- Original Message -----
From: "Ariel T. Glenn" ariel@wikimedia.org
Date: Thursday, September 8, 2011 12:22 am
Subject: [Xmldatadumps-l] another month, another dump. ho hum :-P
To: xmldatadumps-l@lists.wikimedia.org
The September en wikipedia dumps are done. Folks who use them, note that this is the first run with the generation of a pile of smaller files. The naming scheme, as you will have noticed, has an additional string: -p<first-page-id-contained>p<last-pageid-contained>. Expect the specific groupings to change from one run to the next; the split is time-based, rather than based on the number of pages or revisions.
You may notice a gap of a few numbers between files; this would indicate that those pages were deleted and not included in the dump at all.
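For anyone who wants to check the ranges programmatically, here is a rough Python sketch that pulls the first and last page ids out of the -p<first>p<last> suffix and lists where page ids are skipped between consecutive files. The file names in it are made up for illustration and are not taken from the actual run; only the suffix pattern comes from the description above.

```python
import re

# Matches the "-p<first-page-id>p<last-page-id>" part of a dump file name.
RANGE_RE = re.compile(r"-p(\d+)p(\d+)")


def page_range(filename):
    """Return (first_id, last_id) from a file name, or None if there is no range."""
    m = RANGE_RE.search(filename)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2))


def find_gaps(filenames):
    """Return (last_id_of_previous_file, first_id_of_next_file) pairs
    wherever one or more page ids are skipped between consecutive files."""
    ranges = sorted(r for r in map(page_range, filenames) if r is not None)
    gaps = []
    for (_, prev_last), (next_first, _) in zip(ranges, ranges[1:]):
        if next_first > prev_last + 1:
            gaps.append((prev_last, next_first))
    return gaps


# Hypothetical file names, invented for this example only.
example = [
    "enwiki-pages-meta-history1.xml-p000000010p000002000.bz2",
    "enwiki-pages-meta-history1.xml-p000002004p000004500.bz2",
    "enwiki-pages-meta-history1.xml-p000004501p000007000.bz2",
]
print(find_gaps(example))
```

A reported pair only means the ids in between are absent from the files; per the note above, that would be expected for deleted pages.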
Since there were no issues with the network, database servers, broken MW deployments etc., the run finished without any need for restarts of a particular step; this is probably the fastest we'll ever see it run, in a little under 8 days.
Any issues, please let me know. I expect people will need a script to download these files easily; didn't someone on this list have a tool in the works?
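In case it helps as a starting point, a download script could look roughly like the Python sketch below. It assumes the files are linked from an HTML index page and that the -p<first>p<last> suffix is enough to pick them out; the function names and the idea of passing in a base URL are placeholders of this sketch, not a description of any existing tool.

```python
import os
import re
import urllib.request
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def split_file_urls(index_html, base_url):
    """Return absolute URLs of links whose name carries a -p<first>p<last> suffix."""
    parser = LinkCollector()
    parser.feed(index_html)
    pattern = re.compile(r"-p\d+p\d+")
    return [urljoin(base_url, h) for h in parser.hrefs if pattern.search(h)]


def download_all(urls, dest_dir="."):
    """Fetch each URL into dest_dir, skipping files that already exist locally."""
    for url in urls:
        target = os.path.join(dest_dir, os.path.basename(urlparse(url).path))
        if not os.path.exists(target):
            urllib.request.urlretrieve(url, target)
```

The idea would be to fetch the index page for a given run, pass its HTML and URL to split_file_urls, and hand the result to download_all. There is no retry or checksum handling here, which a real script would probably want for files of this size.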
Hi Ariel,
This download addon for Firefox works quite well, and is cross-platform:
http://en.wikipedia.org/wiki/DownThemAll!
https://addons.mozilla.org/en-US/firefox/addon/downthemall/
http://www.downthemall.net/
cheers, Jamie
Ariel
Xmldatadumps-l mailing list Xmldatadumps-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l