Those of you following along will notice that dewiki and wikidatawiki have more files than usual for the page content dumps (pages-meta-history). We'll see more of this going forward; if I get the work done in time, starting in April we'll split these jobs up ahead of time into small files that can be rerun right away when they fail, rather than waiting for MediaWiki to split up the output based on run time and then waiting for a whole set of MW jobs to complete before retrying failures. This will mean more resiliency when dbs are pulled out of the pool for various reasons (schema changes, upgrades, etc.).
Later on during the current run, I hope we will see dumps of magic words and namespaces, provided as json files. Let me put it this way: the code is tested and deployed, now we shall see.
As of right now, the status of a given dump can be retrieved via a file in the current run directory: <wikiname>/20170320/dumpstatus.json. These files are updated frequently during the run. You can also get the status of all current runs at https://dumps.wikimedia.org/index.json. Thanks to Hydriz for the idea of how a status api could be implemented cheaply. This will probably need some refinement, but feel free to play. More information at https://phabricator.wikimedia.org/T147177
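For anyone who wants to script against this, here's a minimal Python sketch that just fetches and pretty-prints those files. I'm assuming the run directory is reachable over the web at the usual <wikiname>/<date>/ location, and the exact fields are still subject to refinement, so don't rely on them too hard yet:

    import json
    from urllib.request import urlopen

    # Status of all current runs (schema may still change as this gets refined).
    with urlopen("https://dumps.wikimedia.org/index.json") as resp:
        index = json.load(resp)
    print(json.dumps(index, indent=2))

    # Status of one wiki's run; the wiki name and date below are just
    # placeholders for whichever run directory you are interested in.
    url = "https://dumps.wikimedia.org/dewiki/20170320/dumpstatus.json"
    with urlopen(url) as resp:
        status = json.load(resp)
    print(json.dumps(status, indent=2))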
The last of the UI updates went live, thanks to Ladsgroup for all of those fixups. It's nice to enter the new century at last :-)
And finally, we moved all the default config info out into a yaml file (thanks to Adam Wight for the first version of that changeset). There were a couple of hiccups with that, which resulted in my starting the en wikipedia run manually for this run, though via the standard script.
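To give a rough idea of the shape of such a file (the keys and paths here are made up for illustration and are not the actual contents of our defaults file), it's just ordinary yaml along these lines:

    # illustrative only; the real defaults file has its own keys and values
    output:
      publicdir: /mnt/dumpsdata/public
      tempdir: /mnt/dumpsdata/temp
    tools:
      php: /usr/bin/php
      gzip: /bin/gzip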
Happy trails,
Ariel
xmldatadumps-l@lists.wikimedia.org