[Labs-l] [Wikitech-l] Brief Labs outage

Petr Bena benapetr at gmail.com
Fri Aug 1 06:21:13 UTC 2014


There is no such dashboard. We have (or had) Icinga and Ganglia, though
I don't know if they are still operational. The problem with both is
that they are maintained primarily by Puppet, which for good reasons is
loathed by many and disabled or killed on many instances, which results
in these monitoring tools being defunct there. The best option would be
to create a new interface for Labs on wikitech where users could control
whether they want Nagios and which services they want checked.

On Fri, Aug 1, 2014 at 7:29 AM, Pine W <wiki.pine at gmail.com> wrote:
> Forwarding to Labs for good measure.
>
> Is there a dashboard where Labs users can see what tasks are currently
> being processed, task queues, system performance, etc.?
>
> Thanks,
>
> Pine
>
> On Jul 31, 2014 10:21 PM, "Ori Livneh" <ori at wikimedia.org> wrote:
>>
>> (Apologies for cross-posting.)
>>
>> We've been noticing an issue with lock-ups on the beta cluster application
>> servers for the past few days. It happens about once or twice a day.
>>
>> It just happened again on both application servers, and I'd really like to
>> try and get to the bottom of things this time. I'll give up and force a
>> restart if I haven't figured it out by 6:30 UTC, about an hour from now.
>> Please accept my apology if this is disrupting your development or QA
>> work, and ping me on IRC if you need Beta back up urgently.
>> _______________________________________________
>> Wikitech-l mailing list
>> Wikitech-l at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
>
> _______________________________________________
> Labs-l mailing list
> Labs-l at lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/labs-l
>


