Hello all,
a few users have contacted me about cron tasks that are not running. A common cause is that the cron lines of these users look like the following:
0 0 * * * DoSomething
or
0 * * * * DoSomething
In an ideal world that would be no problem, but in the real world it CAN be a problem. Why? Because many users have the same idea, and our submit hosts then fail with
(CRON) CAN'T FORK (child_process): Not enough space.
Last night 41 tasks were started successfully at midnight; an unknown number failed. Of course we could just throw new hardware at the problem, but these hosts are idle for most of the day.
So how do we solve this problem? It's easy: spread the load. Most of the time a task (like a bot) does not care whether it is started a few minutes earlier or later. So choose a minute that is not 0 and not divisible by 5. If it really does not matter to you when your task starts, take the position of the first letter of your user name in the alphabet and add 2 ("dab" → "d" → 4 → 6).
To avoid a misunderstanding: if your task REALLY needs to start at minute 0 (or at midnight), do it. And of course cron tasks fail for other reasons too, so contact me (JIRA bug preferred) if you have a problem.
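As a rough illustration of the rule of thumb above, here is a minimal Python sketch. The function names are made up for this example, and the extra step that moves a result off a multiple of 5 is an assumption that is not part of the original rule (without it, a user name starting with "c" would land on minute 5).

```python
import string


def suggested_minute(username: str) -> int:
    """Alphabet position of the first letter plus 2 ("dab" -> "d" -> 4 -> 6)."""
    name = username.strip().lower()
    if not name or name[0] not in string.ascii_lowercase:
        raise ValueError("user name must start with a letter a-z")
    minute = string.ascii_lowercase.index(name[0]) + 1 + 2
    # Assumption, not in the original rule: if the result is divisible by 5,
    # move one minute later so it still avoids the crowded minutes.
    if minute % 5 == 0:
        minute += 1
    return minute


def crontab_line(username: str, hour: str = "*", command: str = "DoSomething") -> str:
    """Build an example cron line that uses the suggested minute."""
    return f"{suggested_minute(username)} {hour} * * * {command}"


if __name__ == "__main__":
    print(crontab_line("dab"))            # hourly job
    print(crontab_line("dab", hour="0"))  # daily job shortly after midnight
```

The largest value this can return is 28 (for "z"), so the result is always a valid minute.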
Sincerely, DaB.
Hello,
And of course cron tasks fail for other reasons too, so contact me (JIRA bug preferred) if you have a problem.
What about https://jira.toolserver.org/browse/TS-1421? I opened this bug months ago and there is still no solution. I still get a lot of stupid mails every day, and a lot of cron jobs do not get started (randomly).
frustrated greetings, Aka (André)
Hello, At Thursday 13 September 2012 23:41:27 DaB. wrote:
What about https://jira.toolserver.org/browse/TS-1421?
Sorry, no solution yet.
Sincerely, DaB.
What about a tool that gathers some statistics from all users' cron(ie)tab files and shows them on a public (or logged-in-users-only) page?
The idea is that users can get a clue about which times have high or low job loads.
Something like:
00 - used xx times - xx% of all jobs
01 - used xx times - xx% of all jobs
02 - ...
03
It might also include hours and more info, BUT NO per-user data, just overall averages and counts.
Greetings DrTrigon
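A minimal sketch of what such a statistics tool could look like, assuming the crontab lines have already been collected into a list of strings (how to read all users' crontabs on the Toolserver is not covered here). The function names are hypothetical. It counts the literal minute field, so an entry like "*/5" is reported as its own bucket rather than expanded, and shortcut lines such as "@daily" are skipped.

```python
from collections import Counter


def minute_histogram(crontab_lines):
    """Count how often each literal minute field is used across crontab lines."""
    counts = Counter()
    for line in crontab_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        # Skip environment assignments (MAILTO=...) and anything that does not
        # have five time fields plus a command.
        if len(fields) < 6 or "=" in fields[0]:
            continue
        counts[fields[0]] += 1
    return counts


def format_report(counts):
    """Render aggregate counts only, most used minute first; no per-user data."""
    total = sum(counts.values())
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [
        f"{minute:>4} - used {n} times - {100 * n / total:.0f}% of all jobs"
        for minute, n in ordered
    ]


if __name__ == "__main__":
    sample = [
        "0 0 * * * DoSomething",
        "0 * * * * DoSomething",
        "# a comment",
        "MAILTO=someone",
        "7 3 * * * DoSomethingElse",
    ]
    for row in format_report(minute_histogram(sample)):
        print(row)
```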
On 13.09.2012 22:59, DaB. wrote:
[...]
_______________________________________________ Toolserver-l mailing list (Toolserver-l@lists.wikimedia.org) https://lists.wikimedia.org/mailman/listinfo/toolserver-l Posting guidelines for this list: https://wiki.toolserver.org/view/Mailing_list_etiquette
On 14/09/12 01:42, Dr. Trigon wrote:
[...]
Ideally, you could mark a task as daily-I-don't-care-when, or perhaps "run every 20-28h", and cron would choose the time that suited it best, taking all registered jobs into account.
Platonides platonides@gmail.com wrote:
[...] Ideally, you could mark a task as daily-I-don't-care-when, or perhaps "run every 20-28h", and cron would choose the time that suited it best, taking all registered jobs into account.
fcron, for example, accomplishes that, but AFAICS it is almost unmaintained (and, at least on Fedora, doesn't work in SELinux environments).
Tim
On 14.09.2012 14:57, Tim Landscheidt wrote:
DaB. wrote:
In an ideal world that would be no problem, but in the real world it CAN be a problem. Why? Because many users have the same idea, and our submit hosts then fail with
(CRON) CAN'T FORK (child_process): Not enough space.
Last night 41 tasks were started successfully at midnight; an unknown number failed. Of course we could just throw new hardware at the problem, but these hosts are idle for most of the day.
With Solaris cron, fixing this problem is easy because you can change the queue configuration in /etc/cron.d/queuedefs (see man queuedefs for more info).
There you could define e.g. "c.35j3n17w", which means that at most 35 jobs are run in parallel and the rest are rescheduled after 17 seconds if there are free slots. The standard Solaris config "c.100j2n60w" would be bad, because it starts more than 41 jobs at once and reschedules the rest after 60 seconds, just when the next cron jobs are starting, too.
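To make the fields of such an entry explicit, here is a small Python sketch that splits a queuedefs line into its parts. It only illustrates the format described above (queue letter, then optional job limit "j", nice value "n" and retry wait "w"); the parser itself is hypothetical and not part of Solaris, and the fallback values are the defaults from the standard entry quoted above.

```python
import re

# queue letter, a dot, then optional <njobs>j, <nice>n and <wait>w parts
QUEUEDEF = re.compile(
    r"^(?P<queue>[a-z])\."
    r"(?:(?P<njobs>\d+)j)?"
    r"(?:(?P<nice>\d+)n)?"
    r"(?:(?P<wait>\d+)w)?$"
)


def parse_queuedef(entry: str) -> dict:
    """Split an entry such as "c.35j3n17w" into named fields."""
    match = QUEUEDEF.match(entry.strip())
    if match is None:
        raise ValueError(f"not a queuedefs entry: {entry!r}")
    return {
        "queue": match["queue"],               # "c" is the cron queue
        "njobs": int(match["njobs"] or 100),   # max jobs running at once
        "nice": int(match["nice"] or 2),       # nice value given to the jobs
        "wait": int(match["wait"] or 60),      # seconds before a retry
    }


if __name__ == "__main__":
    print(parse_queuedef("c.35j3n17w"))
    print(parse_queuedef("c.100j2n60w"))
```

Read this way, "c.35j3n17w" limits the cron queue to 35 concurrent jobs, runs them at nice value 3, and retries the remaining jobs after 17 seconds.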
Does anybody know if Vixie cron (= cronie on TS) supports something similar? That would solve the problem.
By the way: this bug only exists because many people on this mailing list did not like the Solaris crontab format and asked for Vixie cron to be installed as an alternative cron some years ago.
Merlissimo
Merlissimo merl@toolserver.org wrote:
[...] Does anybody know if Vixie cron (= cronie on TS) supports something similar? That would solve the problem. [...]
Not as far as I know (or see in the code).
Tim
Hello, At Friday 14 September 2012 17:19:31 DaB. wrote:
What about a tool that gathers some statistics from all users' cron(ie)tab files and shows them on a public (or logged-in-users-only) page?
Attached is an overview of the number of cron jobs started successfully yesterday.
Sincerely, DaB.
On 14.09.2012 16:46, Merlissimo wrote:
By the way: this bug only exists because many people on this mailing list did not like the Solaris crontab format and asked for Vixie cron to be installed as an alternative cron some years ago.
As far as I remember, my main concern was the "*/14" syntax in cron. Am I wrong that Solaris cron does not support this at all? Otherwise I would be fine with switching to Solaris cron if this helps!
On 14.09.2012 17:20, DaB. wrote:
Attached is an overview of the number of cron jobs started successfully yesterday.
THANKS A LOT, that's great!! What about creating a web page on the TS server containing such lists (e.g. with some graphical display as well) for future use?
Greetings DrTrigon
Hello, At Tuesday 09 October 2012 13:12:04 DaB. wrote:
On 14.09.2012 17:20, DaB. wrote:
Attached is an overview of the number of cron jobs started successfully yesterday.
THANKS A LOT, that's great!! What about creating a web page on the TS server containing such lists (e.g. with some graphical display as well) for future use?
Yesterday I hacked together a Munin script for that; you can find it at [1] for hawthorn (the current submit host). I will add it to clematis (the other submit host) today, and maybe to the other hosts too.
Sincerely, DaB.
[1] http://munin.toolserver.org/Login/hawthorn/cron_jobs_sh.html
On 09.10.2012 13:14, DaB. wrote:
yesterday I hacked together a munin-script for that; you can find it at [1] for hawthorn (the current submit-host). I will add it to clematis (the other submit-host) and maybe the other hosts too today.
Sincerely, DaB.
[1] http://munin.toolserver.org/Login/hawthorn/cron_jobs_sh.html
Cool! Nice job!! Thanks a lot for this!
A hopefully small feature request; would it be possible to have a more detailed view added? Max 1 day (24 hours) in order to resolve the minutes too... (...but anyway, good work!)
Thanks and greetings DrTrigon
ps.: the vixie cron patch looks good so far...(!)
Hello, At Tuesday 09 October 2012 20:57:01 DaB. wrote:
A hopefully small feature request; would it be possible to have a more detailed view added? Max 1 day (24 hours) in order to resolve the minutes too... (...but anyway, good work!)
I would like that, but munin does not support that AFAIK.
Sincerely, DaB.
On 09.10.2012 20:57, DaB. wrote:
Hello, At Tuesday 09 October 2012 20:57:01 DaB. wrote:
A hopefully small feature request; would it be possible to have a more detailed view added? Max 1 day (24 hours) in order to resolve the minutes too... (...but anyway, good work!)
I would like that, but munin does not support that AFAIK.
That's a pity... ;) ...but thanks anyway!!
Not sure if your hack is related, but I'm being spammed by cron: in about the last 30 minutes I've got about 40 cron emails...
Danny B.
------------ Original message ------------ From: DaB. WP@daniel.baur4.info Subject: Re: [Toolserver-l] When to execute cron-tasks Date: 09.10.2012 13:17:30
Hello, At Tuesday 09 October 2012 13:12:04 DaB. wrote:
On 14.09.2012 17:20, DaB. wrote:
Attached is an overview of the number of cron jobs started successfully yesterday.
THANKS A LOT, that's great!! What about creating a web page on the TS server containing such lists (e.g. with some graphical display as well) for future use?
Yesterday I hacked together a Munin script for that; you can find it at [1] for hawthorn (the current submit host). I will add it to clematis (the other submit host) today, and maybe to the other hosts too.
Sincerely, DaB.
[1] http://munin.toolserver.org/Login/hawthorn/cron_jobs_sh.html
-- Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885
Hello DaB
As we discussed yesterday evening, I changed the minute for my cron jobs from 00 to 07. As I explained yesterday, only 4 of 5 jobs ran; today, after the change, only (!) 2 of 5 jobs ran. So this change did not help, but made it worse... any ideas? ;))
Thanks and greetings DrTrigon