Is there some heartbeat monitoring system in the Wikimedia cloud?
I’m running a periodic cron job (on Toolforge/Kubernetes) that I’d like to monitor. Ideally, my cron job could send a heartbeat to some monitoring system after each successful run. For example, it would send an HTTP POST request to monitoring.wmcloud.org/mycronjob or whatever. When no heartbeat has been received for more than 3 weeks, I’d like to receive an email alert at some configured email address. Does Wikimedia operate a heartbeat monitoring system like this?
Best,
— Sascha
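For illustration, the monitoring side of the request above could be sketched roughly as follows. This is only a minimal sketch of the "no heartbeat for more than 3 weeks" rule; the function names and the in-memory dictionary of last-seen timestamps are hypothetical, and no existing Wikimedia service is implied.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the question: alert after more than 3 weeks of silence.
MAX_SILENCE = timedelta(weeks=3)


def is_overdue(last_heartbeat, now, max_silence=MAX_SILENCE):
    """Return True if the last heartbeat is older than max_silence."""
    return now - last_heartbeat > max_silence


def overdue_jobs(last_seen, now):
    """Return the sorted names of jobs whose last heartbeat is too old.

    last_seen is a hypothetical mapping of job name -> datetime of the
    most recent heartbeat; a real system would keep this in a database.
    """
    return sorted(name for name, ts in last_seen.items() if is_overdue(ts, now))


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_seen = {
        "mycronjob": now - timedelta(days=30),
        "otherjob": now - timedelta(days=2),
    }
    for name in overdue_jobs(last_seen, now):
        # A real monitor would send an email to the configured address here.
        print(f"ALERT: no heartbeat from {name} for more than 3 weeks")
```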
I personally use healthchecks.io [1], but I'm not sure whether it would get rate limited with the same IP hitting the external service. (I once considered asking for a self-hosted option to be made available at labs, but if the Wikimedia network is down it won't be sending out notifications, so I decided not to.)
[1]: https://healthchecks.io/docs/
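As a rough sketch of what the cron job side could look like with a service of this kind: the job sends an HTTP request to a per-check ping URL after each run. The UUID below is a made-up placeholder, the helper names are hypothetical, and the `/fail` suffix for reporting a failed run is how healthchecks.io documents it to the best of my knowledge; please check the docs linked above before relying on it.

```python
import urllib.request

# Placeholder: the service assigns each check its own ping URL;
# this UUID is made up for illustration.
PING_URL = "https://hc-ping.com/00000000-0000-0000-0000-000000000000"


def ping_url(base, ok):
    """Return the plain ping URL on success, or the '/fail' variant on failure."""
    return base if ok else base.rstrip("/") + "/fail"


def send_heartbeat(base=PING_URL, ok=True, timeout=10,
                   opener=urllib.request.urlopen):
    """POST an empty body to the ping URL; return True on a 2xx response."""
    req = urllib.request.Request(ping_url(base, ok), data=b"", method="POST")
    try:
        with opener(req, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except OSError:
        # URLError and HTTPError are OSError subclasses; a monitoring
        # outage should not crash the cron job itself.
        return False
```

Calling `send_heartbeat()` as the last step of the job means a heartbeat is only sent when everything before it completed; the `opener` parameter exists only so the function can be exercised without network access.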
---- revi | 레비 he/him In this Korean name, the family name is Hong. https://revi.omg.lol
Cloud mailing list -- cloud@lists.wikimedia.org List information: https://lists.wikimedia.org/postorius/lists/cloud.lists.wikimedia.org/
On Thu, 14 Apr 2022 at 19:41, revi lists@revi.email wrote:
I personally use healthchecks.io [1], but I'm not sure whether it would get rate limited with the same IP hitting the external service. (I once considered asking for a self-hosted option to be made available at labs, but if the Wikimedia network is down it won't be sending out notifications, so I decided not to.)
This is something we could host on a WMES server (outside of WMF networks) for the benefit of the general community: https://github.com/healthchecks/healthchecks/
That would be great imho; I’ve filed a ticket [https://phabricator.wikimedia.org/T306790].
— Sascha