Hello!
There were 2 messages here during January reporting problems with cron. I am now noticing issues with my cronjobs too. Looking at [1], you can see that the strange behaviour started sometime between weeks 2 and 3 (mid-January). Is cron (the server) running out of memory again, or what is the issue here? DaB, can you maybe give some hints? Or someone else?
[1] http://munin.toolserver.org/Login/hawthorn/cron_jobs_sh.html
Thanks a lot and greetings! DrTrigon
Hello,
At Monday 04 February 2013 01:23:08 DaB. wrote:
> Hello!
> There were 2 messages here during January reporting problems with cron.
Both were on willow AFAIS, which is overloaded.
> I am now noticing issues with my cronjobs too.
What exactly is the problem?
> Looking at [1], you can see that the strange behaviour started sometime between weeks 2 and 3 (mid-January).
Sorry, I don't see anything. All I see is that the maximum number of cronjobs has varied more over the last few weeks (but we are far from the numbers seen in autumn if you look at the year graph).
> Is cron (the server) running out of memory again, or what is the issue here? DaB, can you maybe give some hints? Or someone else?
I checked hawthorn and there are a few memory problems at peak times. I will see if I can add another patch.
> Thanks a lot and greetings! DrTrigon
Sincerely, DaB.
On 04.02.2013 01:30, DaB. wrote:
>> I am now noticing issues with my cronjobs too.
> What exactly is the problem?
Cronjobs are not getting executed (as usual... ;). At the moment I also have jobs in the SGE queue that do not get run at all (at least 4).
>> Looking at [1], you can see that the strange behaviour started sometime between weeks 2 and 3 (mid-January).
> Sorry, I don't see anything. All I see is that the maximum number of cronjobs has varied more over the last few weeks (but we are far from the numbers seen in autumn if you look at the year graph).
Before mid-January we had a stable plateau (a more or less constant number of jobs over time). From then on it started breaking down. In fact this is just a guess from looking at the data, but before it was way more stable...
>> Is cron (the server) running out of memory again, or what is the issue here? DaB, can you maybe give some hints? Or someone else?
> I checked hawthorn and there are a few memory problems at peak times. I will see if I can add another patch.
Mentioning memory as a possible issue was just a guess, but there is definitely something wrong and not working as usual.
Thanks for your time, DaB, and greetings! DrTrigon
A "top" shows that the culprits are likely the same as last time: all the CPU, and a lot of process slots (and most probably cron slots too), are currently (ab)used by /home/javadyou/pywikipedia/radeh7.py and /home/reza/pywikipedia/radeh.py
Wolfgang ten Weges/Wolfgang
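For anyone who wants to repeat this kind of check without watching "top" interactively, here is a minimal sketch that totals CPU share per user from text in the shape of `ps -eo user,pcpu,args` output. The function name `cpu_by_user` and the sample rows (users alice, bob, carol) are made up for illustration and are not taken from the server.

```python
from collections import defaultdict


def cpu_by_user(ps_output):
    """Sum the %CPU column per user from `ps -eo user,pcpu,args`-style text."""
    totals = defaultdict(float)
    # Skip the header row, then read "user pcpu command..." from each line.
    for line in ps_output.strip().splitlines()[1:]:
        parts = line.split(None, 2)
        if len(parts) < 2:
            continue
        user, pcpu = parts[0], parts[1]
        try:
            totals[user] += float(pcpu)
        except ValueError:
            # Ignore rows whose second column is not a number.
            continue
    return dict(totals)


# Hypothetical sample data, NOT real output from the Toolserver.
sample = """USER %CPU COMMAND
alice 45.0 python /home/alice/bot.py
bob 30.5 python /home/bob/task.py
alice 20.0 python /home/alice/bot.py
carol 0.5 sshd
"""

totals = cpu_by_user(sample)
# Heaviest users first.
ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
for user, cpu in ranked:
    print(f"{user:10s} {cpu:5.1f}")
```

On a real host the text would come from running `ps` yourself; the sketch only parses a string so that it stays self-contained.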
On 04/02/2013 at 20:21, Dr. Trigon wrote:
> Mentioning memory as a possible issue was just a guess, but there is definitely something wrong and not working as usual.
Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
On Sun, Feb 10, 2013 at 5:48 PM, Wolfgang ten Weges koneko@aliceadsl.fr wrote:
> A "top" shows that the culprits are likely the same as last time: all the CPU, and a lot of process slots (and most probably cron slots too), are currently (ab)used by /home/javadyou/pywikipedia/radeh7.py and /home/reza/pywikipedia/radeh.py
There was an announcement on toolserver-l a while back about a new rule that should be in effect now, which should resolve some of these problems:
http://lists.wikimedia.org/pipermail/toolserver-l/2013-January/005625.html
- Carl
I just noticed the text shown when you log in:
"Users are now encouraged to use job scheduling (SGE) for *all* tools!"
Perhaps "encouraged" is no longer the right way to write it?
I've been busy and sick, so I did not manage to rewrite my tasks; I stopped them all instead. Perhaps someone could create a tool to extend the number of hours per day? :-D
MGA73
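As a rough illustration of what moving a task from plain cron to SGE can involve, the sketch below only builds a `qsub` command line and prints it; it does not submit anything. The helper name `build_qsub_command`, the job name and the paths are hypothetical, and the resource limits the Toolserver actually requires are not stated in this thread, so the `h_rt` runtime value is an assumption.

```python
import shlex


def build_qsub_command(job_name, script, runtime="01:00:00", log_path="/dev/null"):
    """Return an SGE qsub argument list for one run of `script` (hypothetical helper)."""
    return [
        "qsub",
        "-N", job_name,           # job name shown in qstat
        "-l", f"h_rt={runtime}",  # requested maximum run time (assumed value)
        "-j", "y",                # merge stderr into stdout
        "-o", log_path,           # where the merged output is written
        "-b", "y",                # treat `script` as an executable, not a job script
        script,
    ]


# Hypothetical job; the paths are placeholders, not real Toolserver accounts.
cmd = build_qsub_command(
    "mybot",
    "/home/example/bot.py",
    runtime="00:30:00",
    log_path="/home/example/mybot.log",
)
print(shlex.join(cmd))
```

The printed line could then be used as the command in a crontab entry, so that cron only submits the job and the scheduler decides where and when it runs.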
-----Original message----- From: toolserver-l-bounces@lists.wikimedia.org [mailto:toolserver-l-bounces@lists.wikimedia.org] On behalf of Carl (CBM) Sent: 12 February 2013 19:35 To: Wikimedia Toolserver Subject: Re: [Toolserver-l] Cron on submit
> There was an announcement on toolserver-l a while back about a new rule that should be in effect now, which should resolve some of these problems:
> http://lists.wikimedia.org/pipermail/toolserver-l/2013-January/005625.html
> - Carl
Thanks DaB for whatever you (or someone else? ;) did!
Now it works again. As you can see from looking at [1], there was clearly a drop in executed jobs, and now it is at a constant level again! Cool!
[1] http://munin.toolserver.org/Login/hawthorn/cron_jobs_sh.html
Thanks and greetings!! DrTrigon
PS: DaB, what was the solution? I am curious... :)