As was previously announced on the xmldatadumps-l list, the sql/xml dumps generated twice a month will be written to an internal server, starting with the November run. This is in part to reduce load on the web/rsync/nfs server, which until now has also been doing this work. We want separation of roles for some other reasons too.
Because I want to get this right, and there are a lot of moving parts, and I don't want to rsync all the prefetch data over to these boxes again next month after cancelling the move:
******** If needed, the November full run will be delayed for a few days. If the November full run takes too long, the partial run, usually starting on the 20th of the month, will not take place. *********
Additionally, as described in an earlier email on the xmldatadumps-l list:
********* files will show up on the web server/rsync server with a substantial delay. Initially this may be a day or more. This includes index.html and other status files. *********
You can keep track of developments here: https://phabricator.wikimedia.org/T178893
If you know folks not on the lists in the recipients field for this email, please forward it to them and suggest that they subscribe to this list.
Thanks,
Ariel
The first set of dumps is running there and looks like it's working ok. I've done a manual rsync of files produced up to this point, so those are now available on the web server.
As before, you can follow work on this at https://phabricator.wikimedia.org/T178893
Note that some index.html files may contain links to files which did not get picked up by the rsync. They'll be there sometime tomorrow after the next rsync.
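For anyone who wants to check a mirror for links like that, here is a minimal Python sketch. It is not part of the dumps tooling; the function name, the list of file extensions, and the sample file names are all made up for illustration. It takes the text of an index.html page plus the names of the files actually present, and reports linked files that are not there yet.

```python
from html.parser import HTMLParser
from posixpath import basename


class LinkCollector(HTMLParser):
    """Collect the href targets of all <a> tags in an HTML page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def missing_dump_files(index_html, present_files,
                       extensions=(".bz2", ".gz", ".7z")):
    """Return linked file names that are absent from present_files.

    The extensions tuple is an assumption about which links point at
    dump files; adjust it to match the page being checked.
    """
    parser = LinkCollector()
    parser.feed(index_html)
    linked = {basename(h) for h in parser.hrefs if h.endswith(extensions)}
    return sorted(linked - set(present_files))


# Hypothetical page content and directory listing, for illustration only.
sample_page = (
    '<a href="/examplewiki/20171103/examplewiki-20171103-pages-articles.xml.bz2">x</a>'
    '<a href="/examplewiki/20171103/examplewiki-20171103-site_stats.sql.gz">y</a>'
)
print(missing_dump_files(sample_page, ["examplewiki-20171103-site_stats.sql.gz"]))
```

An empty result means every linked dump file is present; anything listed should show up after the next rsync.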
Ariel
On Mon, Oct 30, 2017 at 5:39 PM, Ariel Glenn WMF ariel@wikimedia.org wrote:
> As was previously announced on the xmldatadumps-l list, the sql/xml dumps generated twice a month will be written to an internal server, starting with the November run. [...]
Rsync of xml/sql dumps to the web server is now running on a rolling basis via a script, so you should see updates regularly rather than "every $random hours". There's more to be done on that front; see https://phabricator.wikimedia.org/T179857 for what's next.
Ariel
Hi,
Are there problems with some dumps, like frwiki, with the new system? On the your.org mirror, important files like pages-articles are still missing from the 20171103 dump directory, when usually it only takes a day...
Nico
On Mon, Nov 6, 2017 at 8:01 PM, Ariel Glenn WMF ariel@wikimedia.org wrote:
> Rsync of xml/sql dumps to the web server is now running on a rolling basis via a script [...]
There are no problems that I see. We did get started a couple of days late for this run due to the move to an internal server, but I see all jobs running fine. The frwiki pages-articles dumps have not yet run; enwiki and wikidatawiki are in progress; eswiki, itwiki, jawiki, and zhwiki are busy writing pages-articles right now, etc. Just give it another couple of days :-)
Ariel
On Tue, Nov 7, 2017 at 7:28 PM, Nicolas Vervelle nvervelle@gmail.com wrote:
> Are there problems with some dumps like frwiki with the new system? [...]
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l