Thanks for sharing!
Shani.
On Mon, 23 Jul 2018 23:38 Pine W, <wiki.pine(a)gmail.com> wrote:
Forwarding in case this is of interest to anyone on the Analytics or Research lists who doesn't subscribe to Wikitech-l or Xmldatadumps-l.
Pine
( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ----------
From: Ariel Glenn WMF <ariel(a)wikimedia.org>
Date: Fri, Jul 20, 2018 at 5:53 AM
Subject: [Wikitech-l] hewiki dump to be added to 'big wikis' and run with
multiple processes
To: Wikipedia Xmldatadumps-l <Xmldatadumps-l(a)lists.wikimedia.org>,
Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
Good morning!
The pages-meta-history dumps for hewiki currently take 70 hours, the longest of any wiki not already running with parallel jobs. I plan to add it to the list of 'big wikis' starting August 1st, meaning that 6 jobs will run in parallel, producing the usual numbered file output; see the frwiki dumps for an example.
Please adjust any download/processing scripts accordingly.
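For anyone updating scripts: once a wiki moves to parallel jobs, a run produces several numbered pages-meta-history parts instead of one file, so downloaders should enumerate them rather than hard-code a single name. A minimal sketch in Python, assuming the per-run dumpstatus.json that Wikimedia publishes alongside each dump; the job key and file names below are illustrative, not confirmed from this thread:

```python
def history_part_files(dumpstatus):
    """Return the sorted output file names for the pages-meta-history job.

    `dumpstatus` is the parsed dumpstatus.json for one dump run.
    The job key "metahistorybz2dump" is an assumption based on the
    usual dumpstatus layout; verify it against a real status file.
    """
    job = dumpstatus["jobs"]["metahistorybz2dump"]
    return sorted(job["files"].keys())

# Trimmed example status document (structure mirrors dumpstatus.json;
# the file names are hypothetical):
status = {
    "jobs": {
        "metahistorybz2dump": {
            "status": "done",
            "files": {
                "hewiki-20180801-pages-meta-history1.xml.bz2": {},
                "hewiki-20180801-pages-meta-history2.xml.bz2": {},
            },
        }
    }
}
print(history_part_files(status))
```

Iterating over the job's file listing this way means the same script keeps working whether a wiki produces one part or six.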
Thanks!
Ariel
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l