On 09/05/13 18:57, Tim Landscheidt wrote:
Marlen Caemmerer <marlen.caemmerer@wikimedia.de> wrote:
I would like to reboot ortelius, one of the web servers,
tomorrow (Tuesday) at 18:30 UTC.
Apparently, wolfsbane rebooted today as well:
| timl@wolfsbane:~$ uptime
| 16:49pm up 5:00, 2 users, load average: 1.16, 1.24, 1.47
| timl@wolfsbane:~$
Perhaps related to that, the SGE queues on ortelius and wolfsbane are in state "au" (alarm, unknown):
Yes, sge_execd seems not to be running on them.
In addition, the medium and longrun queues on yarrow are in error state. I tried clearing them, but they failed again.