There is no such dashboard. We have (or had) Icinga and Ganglia, though
I don't know if they are still operational. The problem with both is
that they are maintained primarily by Puppet, which for good reasons is
loathed by many and disabled or killed on many instances, which results
in these monitoring tools being defunct there. The best option would be
to create a new interface in Labs on wikitech where users could control
whether they want Nagios and which services they want checked.
On Fri, Aug 1, 2014 at 7:29 AM, Pine W <wiki.pine(a)gmail.com> wrote:
Forwarding to Labs for good measure.
Is there a dashboard where labs users can see what tasks are currently being
processed, task queues, system performance, etc?
Thanks,
Pine
On Jul 31, 2014 10:21 PM, "Ori Livneh" <ori(a)wikimedia.org> wrote:
(Apologies for cross-posting.)
We've been noticing an issue with lock-ups on the beta cluster application
servers for the past few days. It happens about once or twice a day.
It just happened again on both application servers, and I'd really like to
try and get to the bottom of things this time. I'll give up and force a
restart if I haven't figured it out by 6:30 UTC, about an hour from now.
Please accept my apology if this is disrupting your development or QA
work, and ping me on IRC if you need Beta back up urgently.
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
_______________________________________________
Labs-l mailing list
Labs-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/labs-l