Forwarding in case this is of interest to anyone on the Analytics or Research lists who doesn't subscribe to Wikitech-l or Xmldatadumps-l.

---------- Forwarded message ----------
From: Ariel Glenn WMF <>
Date: Fri, Jul 20, 2018 at 5:53 AM
Subject: [Wikitech-l] hewiki dump to be added to 'big wikis' and run with multiple processes
To: Wikipedia Xmldatadumps-l <>, Wikimedia developers <>

Good morning!

The pages-meta-history dumps for hewiki currently take 70 hours, the
longest of any wiki not already running with parallel jobs. I plan to add
it to the list of 'big wikis' starting August 1st, meaning that 6 jobs will
run in parallel, producing the usual numbered file output; see the frwiki
dumps for an example.

Please adjust any download/processing scripts accordingly.
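For anyone updating a processing script, the practical change is that one large history file becomes several numbered part files. The sketch below shows one way to collect and order those parts. The filename pattern here is an assumption modeled on the frwiki-style numbered dumps; check the actual dump directory listing for the real names.

```python
import re

# Assumed naming pattern for numbered history parts, e.g.
# hewiki-20180801-pages-meta-history1.xml-p1p2442.bz2
# (hypothetical example names; verify against the dump directory).
PART_RE = re.compile(
    r"^hewiki-(?P<date>\d{8})-pages-meta-history(?P<part>\d+)"
    r"\.xml(?:-p\d+p\d+)?\.bz2$"
)

def history_parts(filenames):
    """Return the numbered history part files, sorted by part number."""
    matches = [(int(m.group("part")), name)
               for name in filenames
               if (m := PART_RE.match(name))]
    return [name for _, name in sorted(matches)]

files = [
    "hewiki-20180801-pages-meta-history1.xml-p1p2442.bz2",
    "hewiki-20180801-pages-meta-history3.xml-p90001p99999.bz2",
    "hewiki-20180801-pages-meta-history2.xml-p2443p90000.bz2",
    "hewiki-20180801-stub-meta-history.xml.gz",
]
print(history_parts(files))
```

A script that previously expected a single pages-meta-history file would instead loop over the sorted parts and process each in turn.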
