[Labs-l] Queue master is dead and Icinga question

Sorawee Porncharoenwase nullzero.free at gmail.com
Sun Dec 1 10:02:14 UTC 2013


   1. Unable to connect to Bastion.
   2. Grid status table (https://tools.wmflabs.org/?status) shows nothing.
   3. qstat fails with [error: commlib error: got select error (No route to host)
   error: unable to contact qmaster using port 6444 on host
   "tools-master.pmtpa.wmflabs"]
   4. Some tools return "Service Temporarily Unavailable" (see
   http://lists.wikimedia.org/pipermail/labs-l/2013-December/001892.html)
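
The "No route to host" part of the qstat error suggests the master host is unreachable at the network level, rather than the qmaster daemon alone having stopped. A minimal sketch for telling those cases apart from another instance, assuming Python 3 is available there. The helper name probe_tcp and the 3-second timeout are illustrative choices, not part of Grid Engine or any Labs tooling; the host and port are the ones from the error message.

```python
import errno
import socket


def probe_tcp(host, port, timeout=3.0):
    """Return a short status string describing one TCP connection attempt.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except socket.timeout:
        # No answer within `timeout` seconds.
        return "timeout"
    except ConnectionRefusedError:
        # Host is up, but nothing is listening on the port.
        return "refused"
    except socket.gaierror:
        # The host name could not be resolved.
        return "dns-failure"
    except OSError as exc:
        # "No route to host" / "Network is unreachable" land here.
        if exc.errno in (errno.EHOSTUNREACH, errno.ENETUNREACH):
            return "unreachable"
        return "error"
```

Calling probe_tcp("tools-master.pmtpa.wmflabs", 6444) on an instance that should reach the master gives a rough hint: "refused" would point at the daemon, while "unreachable" or "timeout" would point at the host or network.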

Sorawee


On Sun, Dec 1, 2013 at 4:54 AM, Bryan White <bgwhite at gmail.com> wrote:

> Unable to submit jobs: the queue master (tools-master.pmtpa.wmflabs)
> has been dead since ~08:00 UTC. I have reported it via Bugzilla.
>
> I see the Icinga page (http://icinga.wmflabs.org/icinga/) is
> reporting that the master is down. Assuming the powers that be get
> email/messages about this, does it also need to be reported via Bugzilla?
>
> Would it be a good idea for Icinga to email this list, or a similar
> Labs list, when a major service goes down?
>
> (Sniff. Icinga/Nagios will always be NetSaint to me. I will never
> miss Nagios texting me at 3 in the morning and having to fix things.
> My sympathies to those of you who still get those texts.)
>
> Bryan



-- 
Sorawee Porncharoenwase