Data-engineering-alerts March 2023

data-engineering-alerts@lists.wikimedia.org
  • 29 participants
  • 1279 discussions

[airflow] SLA miss on DAG=virtualpageview_hourly
by airflow-analytics@an-launcher1002.eqiad.wmnet
1 month, 4 weeks

[airflow] SLA miss on DAG=virtualpageview_hourly
by airflow-analytics@an-launcher1002.eqiad.wmnet
1 month, 4 weeks

FAIL: produce_canary_events
by SYSTEMDTIMER
1 month, 4 weeks

Unexpected found in pageview hourly pageview_allowlist_check 2023-03-30
by airflow-analytics@an-launcher1002.eqiad.wmnet
1 month, 4 weeks

[airflow] SLA miss on DAG=refine_webrequest_hourly
by airflow-analytics_test@an-test-client1001.eqiad.wmnet
1 month, 4 weeks

Data Loss ERROR - Airflow Analytics refine_webrequest_hourly 2023-03-29
by airflow-analytics_test@an-test-client1001.eqiad.wmnet
1 month, 4 weeks

** RECOVERY alert - an-launcher1002/Check systemd state is OK **
by nagios@alert1001.wikimedia.org
1 month, 4 weeks

** RECOVERY alert - kafka-jumbo1001/SSH is OK **
by nagios@alert1001.wikimedia.org
1 month, 4 weeks

** PROBLEM alert - kafka-jumbo1001/SSH is CRITICAL **
by nagios@alert1001.wikimedia.org
1 month, 4 weeks

** PROBLEM alert - an-launcher1002/Check systemd state is CRITICAL **
by nagios@alert1001.wikimedia.org
1 month, 4 weeks