Hello,
On Friday 03 May 2013 10:54:42 DaB. wrote:
I've noticed some irregularity in job execution through SGE over the past
few days. Currently it seems several queues are either disabled or in an
error state.
Is this expected? Is there an easy way to get an idea about how many jobs
are queued and how quickly they're executed, in other words how to predict
when a certain job might be run? Or maybe this is just a temporary issue
that'll get resolved shortly?
If a queue is in an error state, something is wrong and a root user or an
operator has to fix it (most of the time, clearing the error state is enough).
Queues that are disabled have been deactivated on purpose. I have now cleared
the queues that were in an error state, and I will look into what the problem
with mayapple is.
It is not easy to tell how many jobs are waiting. The reason is that some
users submit a lot of jobs that are executed with a throttle (for example,
submit 50 jobs but run no more than 5 in parallel), which is perfectly fine.
Normally we have enough resources that no job waits more than a few hours at
most, but there are exceptions.
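For a rough picture of the backlog, one option is to tally the state column
of the job list that `qstat -u '*'` prints. The following is only a sketch
under assumptions: it expects the default tabular layout with the state in
the fifth column, and the sample rows (job IDs, user names, host names) are
invented for illustration. Because of throttled submissions, the number of
waiting jobs is at best a weak hint of when a particular job will start.

```python
from collections import Counter


def count_job_states(qstat_output: str) -> Counter:
    """Tally job states (e.g. 'r', 'qw', 'Eqw') from qstat-style text.

    Assumes the default tabular layout, where the fifth
    whitespace-separated column of each job row is the state.
    """
    counts = Counter()
    for line in qstat_output.splitlines():
        fields = line.split()
        # Job rows start with a numeric job ID; this skips the header
        # line and the dashed separator below it.
        if len(fields) < 5 or not fields[0].isdigit():
            continue
        counts[fields[4]] += 1
    return counts


# Invented sample data, shaped like default qstat output.
SAMPLE = """\
job-ID  prior   name       user         state submit/start at     queue                  slots ja-task-ID
-----------------------------------------------------------------------------------------------------------
   101 0.55500 botjob     alice        r     05/03/2013 10:01:12 longrun@host1          1
   102 0.55500 botjob     alice        qw    05/03/2013 10:02:40                        1
   103 0.50000 stats      bob          qw    05/03/2013 10:05:03                        1
   104 0.50000 stats      bob          Eqw   05/03/2013 10:06:17                        1
"""

if __name__ == "__main__":
    for state, number in sorted(count_job_states(SAMPLE).items()):
        print(state, number)
```

In real use the text would come from the command's output instead of
`SAMPLE`, for example via `subprocess.run(["qstat", "-u", "*"],
capture_output=True, text=True).stdout`.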
Cheers,
Morten
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885