Notification Type: PROBLEM
Service: Check systemd state
Host: cloudcontrol1003
Address: 208.80.154.23
State: CRITICAL
Date/Time: Thu May 13 16:26:57 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
Acknowledged by :
Additional Info:
CRITICAL - degraded: The following units failed: trove-taskmanager.service

Notification Type: RECOVERY
Service: Check unit status of backup_vms
Host: cloudvirt1024
Address: 10.64.20.43
State: OK
Date/Time: Thu May 13 02:05:42 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_time…
Acknowledged by :
Additional Info:
OK: Status of the systemd unit backup_vms

Notification Type: RECOVERY
Service: configured eth
Host: cloudstore1009
Address: 208.80.155.126
State: OK
Date/Time: Wed May 12 21:43:50 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Monitoring/check_eth
Acknowledged by :
Additional Info:
OK - interfaces up

Notification Type: RECOVERY
Service: configured eth
Host: cloudstore1008
Address: 208.80.155.125
State: OK
Date/Time: Wed May 12 21:39:30 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Monitoring/check_eth
Acknowledged by :
Additional Info:
OK - interfaces up

Notification Type: PROBLEM
Service: Check unit status of backup_vms
Host: cloudvirt1024
Address: 10.64.20.43
State: CRITICAL
Date/Time: Wed May 12 05:50:53 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_time…
Acknowledged by :
Additional Info:
CRITICAL: Status of the systemd unit backup_vms

Notification Type: RECOVERY
Service: Check unit status of labs-ip-alias-dump
Host: cloudservices1004
Address: 208.80.154.11
State: OK
Date/Time: Tue May 11 20:38:19 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_time…
Acknowledged by :
Additional Info:
OK: Status of the systemd unit labs-ip-alias-dump

Notification Type: RECOVERY
Service: Check systemd state
Host: cloudservices1004
Address: 208.80.154.11
State: OK
Date/Time: Tue May 11 20:30:31 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Monitoring/check_systemd_state
Acknowledged by :
Additional Info:
OK - running: The system is fully operational

Notification Type: ACKNOWLEDGEMENT
Service: Check unit status of labs-ip-alias-dump
Host: cloudservices1004
Address: 208.80.154.11
State: CRITICAL
Date/Time: Tue May 11 19:42:42 UTC 2021
Notes URLs: https://wikitech.wikimedia.org/wiki/Analytics/Systems/Managing_systemd_time…
Acknowledged by andrew bogott: this should recover in a future run
Additional Info:
CRITICAL: Status of the systemd unit labs-ip-alias-dump