On 05.12.2012 16:21, Morten Wang wrote:
Is there a way for me to find that out myself, e.g. using qstat? I had a look at the qstat man-page, but judging by the descriptions it looks like something I'd have to fiddle around with if/when a job gets queued for a long time at some point in the future to figure out how to do.
qstat -j <jobnumber>
lists a scheduling info section.
Example: qstat -j 799111
scheduling info:
queue instance "short-sol@ortelius.toolserver.org" dropped because it is overloaded: np_load_short=1.252930 (= 1.252930 + 0.8 * 0.000000 with nproc=4) >= 1.1
queue instance "longrun-sol@willow.toolserver.org" dropped because it is overloaded: np_load_short=2.528320 (= 2.528320 + 0.8 * 0.000000 with nproc=8) >= 2.0
queue instance "medium-sol@ortelius.toolserver.org" dropped because it is overloaded: np_load_short=1.252930 (= 1.252930 + 0.8 * 0.000000 with nproc=4) >= 0.8
queue instance "longrun2-sol@clematis.toolserver.org" dropped because it is disabled
queue instance "longrun2-sol@hawthorn.toolserver.org" dropped because it is disabled
(-l h_rt=57600,mem_free=890M,sql=1,sql-s7-rr=3,sqlprocs-s7=3,tmp_free=20M,user_slot=2,virtual_free=890M) cannot run globally because it offers only gc:sql-s7-rr=0.000000
As you can see, the job cannot run on clematis and hawthorn because the queues there are disabled. The queues on willow and ortelius are temporarily overloaded. wolfsbane, nightshade and yarrow are missing from this list, so the bot could start on those servers. But the last line, "cannot run globally because it offers only gc:sql-s7-rr=0.000000", shows that the resource sql-s7-rr is not available on any server at the moment: the job requests sql-s7-rr=3, and 0 is offered. That's why the job stays queued until the s7 database is usable again.
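If you would rather not read the scheduling info by eye, here is a rough Python sketch that groups the messages by reason. It only knows the two message types that appear in the example above ("dropped because it is ..." and "cannot run globally because it offers only ..."). The function name summarize_scheduling_info is made up for illustration, and other qstat messages may well look different.

```python
import re

# Patterns for the two message types seen in the example output above.
# Other scheduling info messages are not covered (assumption of this sketch).
QUEUE_RE = re.compile(r'queue instance "([^"]+)" dropped because it is (\w+)')
GLOBAL_RE = re.compile(r'cannot run globally because it offers only (\S+)')


def summarize_scheduling_info(text):
    """Return (queues dropped grouped by reason, globally missing resources)."""
    dropped = {}
    for queue, reason in QUEUE_RE.findall(text):
        dropped.setdefault(reason, []).append(queue)
    missing = GLOBAL_RE.findall(text)
    return dropped, missing


# Shortened copy of the example output for job 799111.
sample = '''
queue instance "short-sol@ortelius.toolserver.org" dropped because it is overloaded: np_load_short=1.252930 (= 1.252930 + 0.8 * 0.000000 with nproc=4) >= 1.1
queue instance "longrun-sol@willow.toolserver.org" dropped because it is overloaded: np_load_short=2.528320 (= 2.528320 + 0.8 * 0.000000 with nproc=8) >= 2.0
queue instance "longrun2-sol@clematis.toolserver.org" dropped because it is disabled
queue instance "longrun2-sol@hawthorn.toolserver.org" dropped because it is disabled
(-l h_rt=57600,sql=1,sql-s7-rr=3) cannot run globally because it offers only gc:sql-s7-rr=0.000000
'''

dropped, missing = summarize_scheduling_info(sample)
for reason, queues in dropped.items():
    print(reason + ": " + ", ".join(queues))
for resource in missing:
    print("not available globally: " + resource)
```

To use it on a real job, you would pass in the text printed by qstat -j <jobnumber> instead of the sample string. Any "not available globally" line means the job cannot start anywhere, no matter which queues are free.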
Merlissimo