[Labs-l] Tool Labs SGE outage
Russell Blau
russblau at imapmail.org
Thu May 28 11:59:45 UTC 2015
Yuvi Panda <yuvipanda <at> gmail.com> writes:
>
> It's been back and working mostly well for a while now. According to
> alerts the partial outage was from 18:33 UTC to 20:17 UTC. More
> details to follow later, here and at
> https://phabricator.wikimedia.org/T100554
This seems not to be entirely fixed. All night, I have been getting
intermittent errors on cron jobs with the following message:
error: commlib error: access denied (server host resolves rdata host
"tools-submit.eqiad.wmflabs" as "(HOST_NOT_RESOLVABLE)")
Curiously, not all grid jobs fail in this way; some have been running
successfully, and I can see no pattern in which ones fail.
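
One rough way to check whether the hostname lookup is itself intermittent is to repeat it a few times and record the outcomes. The sketch below is only that: the function names `resolves` and `sample_resolution` are hypothetical, and it exercises the local resolver on whatever host runs it. The error above is reported by the grid's commlib on the server side, so this check may not touch the lookup that is actually failing.

```python
import socket


def resolves(hostname, resolver=socket.gethostbyname):
    """Return True if the lookup succeeds, False if it raises a lookup error."""
    try:
        resolver(hostname)
        return True
    except OSError:
        # socket.gaierror and socket.herror are both subclasses of OSError.
        return False


def sample_resolution(hostname, attempts=5, resolver=socket.gethostbyname):
    """Repeat the lookup `attempts` times and return the list of outcomes."""
    return [resolves(hostname, resolver) for _ in range(attempts)]


if __name__ == "__main__":
    # Hostname taken from the error message; it is only expected to
    # resolve from inside the Labs network.
    print(sample_resolution("tools-submit.eqiad.wmflabs"))
```

A mix of True and False values in the output would suggest the lookup is flaky from that host; all True would point elsewhere, for example at the resolver used by the grid master.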