Dear all, since Tuesday I’ve been uploading around 8,000 pictures of amoebae, and everything was going smoothly until yesterday afternoon.
It seems that at some point we saturated our own server with too many parallel requests, and GWToolset started getting "HTTP request timed out" errors, as you can see in the GWToolset log: https://commons.wikimedia.org/w/index.php?title=Special:Log&type=gwtoolset. It has been the same story since yesterday around 5 PM.
I’ve asked our CIO, and he told me that GWToolset is indeed using the whole bandwidth of our server :-(
Now I don’t know what to do. Will GWToolset stop making requests to the WMCH server at some point? Should we restart our server?
Incidentally, the account used for setting up the batch upload now receives several of these: "Error: 1205 Lock wait timeout exceeded; try restarting transaction (10.0.6.41)".
Do you think it’s related? I saw that someone already mentioned this error while using GWToolset.
Thanks to all for your help.
Cheers
Charles
___________________________________________________________
Charles ANDRES, Chief Science Officer "Wikimedia CH" – Association for the advancement of free knowledge – www.wikimedia.ch Office +41 (0)21 340 66 21 Mobile +41 (0)78 910 00 97 Skype: charles.andres.wmch irc://irc.freenode.net/wikimedia-ch http://prezi.com/user/Andrescharles/
If GWToolset is overloading your server and you need it to stop right away (e.g. it's an emergency), contact someone in #wikimedia-operations and explain the situation.
--bawolff On May 29, 2015 7:48 AM, "charles andrès" charles.andres@wikimedia.ch wrote:
Glamtools mailing list Glamtools@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/glamtools
Yes, our server is now saturated, and the job is taking longer than it would if it were succeeding :-(
I asked on Saturday on #wikimedia-operations but nobody answered; I’m trying again today.
charles
On 30 May 2015 at 00:26, Brian Wolff bawolff@gmail.com wrote:
OK, people are looking into it right now, but it doesn’t seem to be that easy to do.
Stupid question: has there already been discussion about having an "emergency button" to stop a GWToolset job?
charles
On 1 June 2015 at 10:34, Charles Andrès charles.andres@wikimedia.ch wrote:
There's no kill button. The job would have to be killed by an operator (a Phabricator request is probably in order). As a general option, a blanket kill process would be a bad idea; what would be more useful is a way for an end user to kill or pause a job using a given job ID. There are no current plans to implement anything like this; it's something to add to a Stage 2, if that ever happens.
Out of interest, how many processing threads were chosen in GWT for the job? It may be an idea to change the input page to default to 2 threads, with warnings if you choose more than 8 or so. I can imagine 20 processing threads causing a server issue for large files; in practice I used 4 or 5 for my largest upload jobs. Probably something to usefully add to the user guide.
Fae
On 1 June 2015 at 10:35, Charles Andrès charles.andres@wikimedia.ch wrote:
On 6/2/15, Fæ faewik@gmail.com wrote:
There's no kill button. The job would have to be killed by an operator (a phabricator request is probably in order).
apergos took care of that.
As a general option, it would be a bad idea to have a general kill process, what would be more useful would be a process for an end user to kill/pause a job request using a given job ID. There are no current plans to implement anything like this, something to add to a Stage 2 if that ever happens.
See https://phabricator.wikimedia.org/T100972
--bawolff
Out of interest, how many processing threads were chosen in GWT for the job? It may be an idea to change the input page to default to 2 threads, with warnings if you choose more than 8 or so. I can imagine 20 processing threads causing a server issue for large files; in practice I used 4 or 5 for my largest upload jobs. Probably something to usefully add to the user guide.
I kept the default setting of 5. My guess is that when our server got overloaded, GWToolset started making more requests than it naturally would have, repeating the ones that were failing, but it's just my guess.
Charles
On 6/3/15, Charles Andrès charles.andres@wikimedia.ch wrote:
I kept the default setting of 5. My guess is that when our server got overloaded, GWToolset started making more requests than it naturally would have, repeating the ones that were failing, but it's just my guess.
Charles
Hmm, I'm not sure whether it retries requests that fail; in terms of what gets logged, that does not appear to be the case (but it might try multiple times per item logged as a failure).
Here's the day by day of your upload job (for User:Neuchâtel Herbarium)
MariaDB [commonswiki_p]> select substr( log_timestamp, 1, 8 ), log_action, count(*) from logging_logindex where log_type = 'gwtoolset' and log_timestamp > '20150500000000' and log_user = 2103899 and log_action != 'metadata-job' group by 1, 2;
+-------------------------------+-------------------------+----------+
| substr( log_timestamp, 1, 8 ) | log_action              | count(*) |
+-------------------------------+-------------------------+----------+
| 20150526                      | mediafile-job-failed    |      378 |
| 20150526                      | mediafile-job-succeeded |      926 |
| 20150527                      | mediafile-job-failed    |      115 |
| 20150527                      | mediafile-job-succeeded |     3734 |
| 20150528                      | mediafile-job-failed    |     6431 |
| 20150528                      | mediafile-job-succeeded |     6327 |
| 20150529                      | mediafile-job-failed    |    12148 |
| 20150530                      | mediafile-job-failed    |    11915 |
| 20150531                      | mediafile-job-failed    |    12371 |
| 20150531                      | mediafile-job-succeeded |        6 |
| 20150601                      | mediafile-job-failed    |     7636 |
| 20150601                      | mediafile-job-succeeded |      225 |
+-------------------------------+-------------------------+----------+
12 rows in set (0.56 sec)
On May 28th, about 50% of the files failed; however, the number of files it attempted to fetch was roughly the same as on May 29th, when every single file failed.
I think this suggests that gwtoolset should have some sort of back-off feature to slow down the request rate when things start to fail (particularly with "HTTP request timed out.").
--bawolff
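For what it's worth, the back-off idea described above could look something like the following. This is a minimal sketch in Python, not GWToolset's actual PHP job runner; `fetch` is a hypothetical stand-in for whatever HTTP client the job uses:

```python
import random
import time

def fetch_with_backoff(fetch, url, max_attempts=5, base_delay=1.0):
    """Retry a fetch with exponential back-off and jitter.

    `fetch` is any callable that raises TimeoutError on an
    "HTTP request timed out" condition (a hypothetical stand-in
    for the real HTTP client).
    """
    for attempt in range(max_attempts):
        try:
            return fetch(url)
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # give up after max_attempts tries
            # Sleep base_delay, 2*base_delay, 4*base_delay, ... plus
            # jitter, so a struggling server sees the request rate
            # drop instead of getting hammered at a constant rate.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            time.sleep(delay)
```

The key property is that a source server that is timing out gets progressively more breathing room instead of the same (or higher) request rate, which matches the failure pattern in the log table above.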
I’m not really competent to judge the technical aspect, but if it can help, here is an example of the requests that were actually being made to our server while all jobs were failing:
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:23 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_343-5-5.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:28 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_344-1-2.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:29 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_344-2-2.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:31 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_344-1-1.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:38 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_344-1-3.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:41 +0200] "GET /penard/Collection_Penard_MHNG_Specimen_339-17-3.tif HTTP/1.1" 200 3322941 "-" "MediaWiki/1.26wmf7"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:41 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_344-2-3.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:50 +0200] "GET /penard/Collection_Penard_MHNG_Specimen_345-4-2.tif HTTP/1.1" 200 867440 "-" "MediaWiki/1.26wmf7"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:52 +0200] "GET /penard/Collection_Penard_MHNG_Specimen_342-2-1.tif HTTP/1.1" 200 1837688 "-" "MediaWiki/1.26wmf7"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:17:55 +0200] "GET /penard/Collection_Penard_MHNG_Specimen_342-3-1.tif HTTP/1.1" 200 1195016 "-" "MediaWiki/1.26wmf7"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:18:01 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_341-2-2.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:18:02 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_341-2-3.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:18:04 +0200] "HEAD /penard/Collection_Penard_MHNG_Specimen_344-2-4.tif HTTP/1.1" 200 0 "-" "MediaWiki/1.26wmf7 GWToolset/0.3.8"
208.80.154.156 lan.wikimedia.ch - [29/May/2015:13:18:07 +0200] "GET /penard/Collection_Penard_MHNG_Specimen_366-2-2.tif HTTP/1.1" 200 1035100 "-" "MediaWiki/1.26wmf7"
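For anyone who wants to quantify this kind of load from their own access log, a small sketch (Python, assuming the Apache-style combined log format in the excerpt above) that counts requests per minute, split by HTTP method:

```python
import re
from collections import Counter

# Matches the timestamp (to the minute) and HTTP method of an
# Apache combined-format entry, e.g.
#   ... [29/May/2015:13:17:23 +0200] "HEAD /penard/x.tif HTTP/1.1" ...
LOG_RE = re.compile(r'\[(\d{2}/\w{3}/\d{4}:\d{2}:\d{2}):\d{2} [^\]]+\] "(\w+) ')

def requests_per_minute(lines):
    """Return a Counter keyed by (minute, method)."""
    counts = Counter()
    for line in lines:
        m = LOG_RE.search(line)
        if m:
            minute, method = m.groups()
            counts[(minute, method)] += 1
    return counts
```

Feeding the full log through this makes it easy to see whether the HEAD/GET rate actually rose once the jobs started failing, which is the question raised above about retry behaviour.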
On 3 June 2015 at 16:10, Brian Wolff bawolff@gmail.com wrote:
On 3 June 2015 at 15:10, Brian Wolff bawolff@gmail.com wrote: ...
I think this suggests that gwtoolset should have some sort of back-off feature when things start to fail (particularly due to "HTTP request timed out.") to slow down the request rate.
When I used to analyse big stuff like oil rigs, part of the basic 1980s engineering was to keep poking at what happens during 'black out' and 'brown out' conditions. I guess in the post-Agile 2015 world this falls under boundary-case testing. When we were specifying the GWT, this was the sort of thinking about user-friendly robustness that never made it into 'phase one'.
Charles' case is good stuff to tease out into some feature requests, even if we only park them for the moment. Hopefully with evolving strategies, the WMF will adopt more of this, or indeed provide a playpen where the nuts and bolts of internet transaction problems are handled by off-the-shelf secondary playpen modules. Okay, it's fantasy, but you never know. ;-)
Fae
[..] or indeed provide a playpen where the nuts and bolts of internet transaction problems are handled by off-the-shelf secondary playpen modules. Okay, it's fantasy, but you never know. ;-)
Fae
I don't think I understand what you mean by that. Are you hoping that gwtoolset will be allowed to load things from tool labs in the future?
-- bawolff
the original throttling in the gwtoolset form allowed a user to limit the number of threads carried out per job-runner run, but i think that has changed. someone on the previous multimedia team might know whether the throttle field still has any impact; if it does, try setting it to 1.
still, this issue brings up a few features that would be ideal. a method that allows the originator of the batch job or an admin:
* the ability to cancel the batch job
* the ability to slow down the batch job
* the ability to monitor the batch job
with kind regards, dan
On Wed, Jun 3, 2015 at 11:51 AM, Brian Wolff bawolff@gmail.com wrote:
On Thu, Jun 4, 2015 at 1:57 PM, dan entous dan.entous.wikimedia@gmail.com wrote:
the original throttling in the gwtoolset form allowed a user to limit the number of threads carried out per job runner run, but i think that has changed. someone on the previous multimedia team might know if the throttle field still has any impact or not. if it does, try setting it to 1.
That should still work AFAIK.
Dear all,
My batch jobs of Japanese colour drawings from Naturalis stop after the first intake of, say, 5, 10 or 15 images (depending on the throttle choice). It's still better than UploadWizard, but...
Compare
https://commons.wikimedia.org/w/index.php?title=Special:ListFiles/Hansmuller...
My fault or a server quirk?
Thanks, hans muller Wikipedian in residence at Naturalis, Leiden
Looks like the job is still running, so you need to wait. Looks like there are some duplicates:
18:03, 20 July 2015 Hansmuller (talk | contribs | block) mediafile job failed. Metadata record 107. <Duplicate media file: This media file already exists and has the same title "File:Naturalis Biodiversity Center - RMNH.ART.120 - Paraplagusia japonica (Temminck and Schlegel) - Kawahara Keiga - 1823 - 1829 - Siebold Collection - pencil drawing - water colour.jpeg".> original URL: http://medialib.naturalis.nl/file/id/RMNH.ART.120/format/large evaluated URL: http://medialib.naturalis.nl/file/id/RMNH.ART.120/format/large. (kawahara-grote-upload-string-verkort-1TM13-256-537-ERUIT.xml, mapping kawahara-1.json)
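One way to avoid queuing files Commons already has is to check for an existing file with the same SHA-1 hash before adding the record to the metadata set. A rough Python sketch using the MediaWiki API's list=allimages query with the aisha1 parameter; the local file path here is hypothetical, and this only builds the query URL rather than performing the request:

```python
import hashlib
import urllib.parse

API = "https://commons.wikimedia.org/w/api.php"

def duplicate_check_url(path):
    """Build an API query URL that lists Commons files whose SHA-1
    matches the local file's, i.e. byte-for-byte duplicates."""
    with open(path, "rb") as f:
        sha1 = hashlib.sha1(f.read()).hexdigest()
    params = {
        "action": "query",
        "list": "allimages",
        "aisha1": sha1,
        "format": "json",
    }
    return API + "?" + urllib.parse.urlencode(params)
```

If the query returns any images, the file is already on Commons (possibly under a different title) and can be dropped from the batch before GWToolset ever fetches it.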
Thanks, you were right! But why can I only upload fifteen 100 kB files an hour while others can do 5 large files a minute? Can I run multiple jobs in parallel?
Thanks, hans muller
On Mon, 20 July 2015, 8:07 pm, Steinsplitter Wiki wrote:
charles andrès, 29/05/2015 15:47:
I’ve asked our CIO, and he told me that indeed GWToolset is using the whole bandwidth of our server :-(
Assuming the concurrency limit in GWT works, perhaps your webserver should also limit the bandwidth a single user/IP range can consume, as download.wikimedia.org does (though hopefully not as strictly).
Nemo
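A server-side limit like the one Nemo suggests can be done in the webserver itself. For an Apache 2.4 server (the access log earlier in the thread looks like Apache combined format), a sketch using mod_ratelimit; the /penard path matches that log excerpt, and the 2048 KiB/s figure is an arbitrary example, not a recommendation:

```apache
# Cap transfer speed for the collection directory GWToolset fetches from.
# Requires mod_ratelimit (a2enmod ratelimit on Debian/Ubuntu).
<Location "/penard">
    SetOutputFilter RATE_LIMIT
    # rate-limit is in KiB/s and applies per response, not in aggregate
    SetEnv rate-limit 2048
</Location>
```

Note that mod_ratelimit throttles each response individually; limiting the aggregate rate for an IP range would need something else (e.g. a module such as mod_qos, or a front proxy), so this only softens, rather than prevents, saturation from many parallel requests.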