Data-engineering-alerts November 2022

data-engineering-alerts@lists.wikimedia.org
  • 26 participants
  • 726 discussions

** RECOVERY alert - an-launcher1002/Check systemd state is OK **
by nagios@alert1001.wikimedia.org
6 months

** PROBLEM alert - an-launcher1002/Check systemd state is CRITICAL **
by nagios@alert1001.wikimedia.org
6 months

Output of systemd timer for '/usr/local/bin/kerberos-run-command analytics /usr/local/bin/produce_canary_events'
by SYSTEMDTIMER
6 months

Refine failures for job refine_event
by refine@an-worker1102.eqiad.wmnet
6 months

OOZIE - SLA END_MISS (AppName=webrequest-druid-hourly-coord, JobID=0068485-220913162928808-oozie-oozi-C@123)
by oozie@an-coord1001.eqiad.wmnet
6 months

OOZIE - SLA END_MISS (AppName=webrequest-druid-hourly-coord, JobID=0068485-220913162928808-oozie-oozi-C@122)
by oozie@an-coord1001.eqiad.wmnet
6 months

OOZIE - SLA END_MISS (AppName=webrequest-druid-daily-coord, JobID=0068490-220913162928808-oozie-oozi-C@6)
by oozie@an-coord1001.eqiad.wmnet
6 months

RECOVERY Host an-worker1089 - PING OK - Packet loss = 0%, RTA = 0.28 ms
by nagios@alert1001.wikimedia.org
6 months

RECOVERY Host an-worker1094.mgmt - PING OK - Packet loss = 0%, RTA = 0.64 ms
by nagios@alert1001.wikimedia.org
6 months

PROBLEM Host an-worker1094.mgmt - PING CRITICAL - Packet loss = 100%
by nagios@alert1001.wikimedia.org
6 months