[Labs-l] queue down (for real)
Andrew Bogott
abogott at wikimedia.org
Wed Dec 30 02:22:02 UTC 2015
On 12/29/15 7:01 PM, Bryan White wrote:
> nothing of mine has run on the queue for ~90 minutes.
>
> Output of 'qstat -f'
> error: commlib error: got select error (Connection refused)
> error: unable to send message to qmaster using port 6444 on host
> "tools-grid-master.tools.eqiad.wmflabs": got send error
>
Roughly 12,000 jobs were scheduled over the course of about 90 minutes, and
the grid is overwhelmed. We're working on untangling the mess.
> Bryan