Data-engineering-alerts February 2023

data-engineering-alerts@lists.wikimedia.org
  • 28 participants
  • 1514 discussions

[airflow] SLA miss on DAG=aqs_hourly
by airflow-analytics_test@an-test-client1001.eqiad.wmnet
3 months

** RECOVERY alert - an-launcher1002/Check systemd state is OK **
by nagios@alert1001.wikimedia.org
3 months

** PROBLEM alert - an-launcher1002/Check systemd state is CRITICAL **
by nagios@alert1001.wikimedia.org
3 months

FAIL: produce_canary_events
by SYSTEMDTIMER
3 months

** RECOVERY alert - an-launcher1002/Check systemd state is OK **
by nagios@alert1001.wikimedia.org
3 months

** RECOVERY alert - an-worker1132/Check systemd state is OK **
by nagios@alert1001.wikimedia.org
3 months

** PROBLEM alert - an-worker1132/Check systemd state is CRITICAL **
by nagios@alert1001.wikimedia.org
3 months

** PROBLEM alert - an-worker1132/Check systemd state is CRITICAL **
by nagios@alert1001.wikimedia.org
3 months

Refine failures for job refine_eventlogging_analytics
by refine@an-worker1148.eqiad.wmnet
3 months

Refine failures for job refine_eventlogging_analytics
by refine@an-worker1119.eqiad.wmnet
3 months