[Labs-l] Tool Labs SGE outage

Russell Blau russblau at imapmail.org
Thu May 28 11:59:45 UTC 2015


Yuvi Panda <yuvipanda <at> gmail.com> writes:

> 
> It's been back and working mostly well for a while now. According to
> alerts the partial outage was from 18:33 UTC to 20:17 UTC. More
> details to follow later, here and at
> https://phabricator.wikimedia.org/T100554

This does not seem to be entirely fixed. All night, I have been getting 
intermittent errors from cron jobs with the following message:

error: commlib error: access denied (server host resolves rdata host 
"tools-submit.eqiad.wmflabs" as "(HOST_NOT_RESOLVABLE)")

Curiously, not all grid jobs fail in this way; some of them have been 
running successfully, and I can see no pattern in which ones fail.
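
For anyone who wants to measure how intermittent the resolution failures 
are, here is a minimal sketch, not a diagnosis. It repeatedly looks up a 
host name and reports the fraction of failed attempts. The function names 
(probe_resolution, failure_rate) and the attempt and delay values are my 
own illustrative choices. Note that the error above is reported by the 
server side, so a forward lookup from whichever host runs this only 
approximates what the grid master sees.

```python
import socket
import time


def probe_resolution(host, attempts=5, delay=0.0, resolver=socket.gethostbyname):
    """Try to resolve `host` several times.

    Returns a list of (ok, detail) tuples, where detail is the resolved
    address on success or the error text on failure. `resolver` is
    injectable so the function can be exercised without real DNS.
    """
    results = []
    for i in range(attempts):
        try:
            results.append((True, resolver(host)))
        except OSError as exc:  # socket.gaierror is a subclass of OSError
            results.append((False, str(exc)))
        if delay and i < attempts - 1:
            time.sleep(delay)
    return results


def failure_rate(results):
    """Fraction of attempts that failed (0.0 for an empty list)."""
    if not results:
        return 0.0
    return sum(1 for ok, _ in results if not ok) / len(results)


if __name__ == "__main__":
    # Host name taken from the error message above; attempts and delay
    # are arbitrary values chosen for illustration.
    outcome = probe_resolution("tools-submit.eqiad.wmflabs", attempts=10, delay=1.0)
    print("failure rate:", failure_rate(outcome))
```

A failure rate strictly between 0 and 1 would be consistent with the 
intermittent behaviour described above, though it would not by itself show 
where the lookups are failing.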




