Dumps consumers:
Forwarding this note from the xmldatadumps-l(a)lists.wikimedia.org list
about an upcoming change to the number of files generated for some
wikis (for example dewiki's stub-articles dumps).
If you use dumps regularly in your tool, please do subscribe to the
xmldatadumps-l(a)lists.wikimedia.org list [0] for future notices like
this. The list is generally low volume and high signal with
announcements of process change as well as operational issues that may
slow dumps generation.
[0]: https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
Bryan
---------- Forwarded message ----------
From: Ariel Glenn WMF <ariel(a)wikimedia.org>
Date: Thu, May 31, 2018 at 5:36 AM
Subject: [Xmldatadumps-l] change to output file numbering of big wikis
To: Wikipedia Xmldatadumps-l <Xmldatadumps-l(a)lists.wikimedia.org>,
Wikimedia developers <wikitech-l(a)lists.wikimedia.org>
TL;DR:
Scripts that rely on XML files numbered 1 through 4 should be updated
to check for 1 through 6.
Explanation:
A number of wikis have stubs and page content files generated 4 parts
at a time, with the appropriate number added to the filename. I'm
going to be increasing that to 6 this month.
The reason for the increase is that near the end of the run there are
usually just a few big wikis still finishing. If they run with 6
processes at once, they'll finish up a bit sooner.
If you have scripts that rely on the number 4, just increase it to 6
and you're done.
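
If you would rather not hardcode the count at all, one option is to
discover the numbered parts from the file listing. The following is only
a rough sketch: the filename shape (wiki, 8-digit date, "stub-articles",
part number, ".xml.gz") and the names PART_RE and numbered_parts are
assumptions made for illustration, so adjust the pattern to whatever your
script actually reads.

```python
import re

# Assumed filename shape (not stated in this announcement):
#   <wiki>-<YYYYMMDD>-stub-articles<N>.xml.gz
PART_RE = re.compile(
    r"^(?P<wiki>\w+)-(?P<date>\d{8})-stub-articles(?P<part>\d+)\.xml\.gz$"
)


def numbered_parts(filenames):
    """Return (part_number, filename) pairs, sorted by part number.

    Files without a part number (or with another naming shape) are skipped,
    so the result works whether a run produced 4 parts, 6, or some other count.
    """
    parts = []
    for name in filenames:
        match = PART_RE.match(name)
        if match:
            parts.append((int(match.group("part")), name))
    return sorted(parts)
```

A script could pass in the output of os.listdir() for the dump
directory and loop over whatever comes back, instead of iterating over a
fixed range.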
This will go into effect for the June 1 run and all runs afterwards.
Thanks!
_______________________________________________
Xmldatadumps-l mailing list
Xmldatadumps-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
--
Bryan Davis Wikimedia Foundation <bd808(a)wikimedia.org>
[[m:User:BDavis_(WMF)]] Manager, Cloud Services Boise, ID USA
irc: bd808 v:415.839.6885 x6855