What is the content of these canary events?
Good question! I just updated EventStreams docs here https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams#Canary_Events with an answer:
The content of most canary event fields is copied directly from the first example event in the event's schema, e.g. the mediawiki/recentchange example https://github.com/wikimedia/schemas-event-primary/blob/master/jsonschema/mediawiki/recentchange/1.0.1.yaml#L159 and the mediawiki/revision/create example https://github.com/wikimedia/schemas-event-primary/blob/master/jsonschema/mediawiki/revision/create/2.0.0.yaml#L288 . These examples can also be seen in the OpenAPI docs for the streams https://stream.wikimedia.org/?doc#/streams , e.g. the mediawiki.page-move example value https://stream.wikimedia.org/?doc#/streams/get_v2_stream_mediawiki_page_move . As of 2023-11, the code that creates canary events can be found here: https://gerrit.wikimedia.org/r/plugins/gitiles/wikimedia-event-utilities/%2B/refs/heads/master/eventutilities/src/main/java/org/wikimedia/eventutilities/monitoring/CanaryEventProducer.java#118
On Thu, Nov 9, 2023 at 11:14 AM Siddharth VP siddharthvp@gmail.com wrote:
Hi Andrew,
What is the content of these canary events? Do they have a data section or is it just the metadata? If I already have filtering to process only interesting events (say data.wiki === 'enwiki'), do I still need to add additional filtering to discard canary events?
On Thu, 9 Nov 2023 at 21:23, Andrew Otto otto@wikimedia.org wrote:
tl;dr
Ignore this email if you do not use MediaWiki event streams.
On Monday, December 11, 2023, all MediaWiki-related event streams will have artificial canary events https://wikitech.wikimedia.org/wiki/Event_Platform/EventStreams#Canary_Events injected into them. If you use any of these streams, you should discard these canary events.
*Add code to your consumers that discards events where* meta.domain == "canary".
Canary Events
At WMF, we use artificial 'canary' AKA 'heartbeat' events https://wikitech.wikimedia.org/wiki/Event_Platform/Stream_Configuration#canary_events_enabled to differentiate between a broken event stream and an empty event stream. Canary events should be produced at least once an hour. If there are no events in a stream for an hour, then something is likely broken with that stream.
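The broken-versus-empty distinction above can be sketched as a simple staleness check. This is only an illustration: the function name, the one-hour default, and the idea of tracking the timestamp of the last event seen (canary or not) are assumptions for this example, not WMF's actual monitoring code.

```python
from datetime import datetime, timedelta, timezone

# Assumed threshold, taken from the "at least once an hour" statement above.
MAX_SILENCE = timedelta(hours=1)


def stream_looks_broken(
    last_event_at: datetime,
    now: datetime,
    max_silence: timedelta = MAX_SILENCE,
) -> bool:
    """Return True if no event of any kind has arrived within max_silence.

    last_event_at should be updated for every event received, canary
    events included: the canaries are what keep an otherwise empty
    stream from looking silent.
    """
    return now - last_event_at > max_silence


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(stream_looks_broken(now - timedelta(minutes=90), now))
    print(stream_looks_broken(now - timedelta(minutes=10), now))
```

With canary events enabled, a stream that carries no real traffic still updates the timestamp at least hourly, so a long silence points at a problem rather than a quiet wiki.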
These artificial canary events can be identified by the fact that their meta.domain field is set to "canary". If you use any of the streams listed below, you will need to add code that discards any events where meta.domain == "canary".
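A minimal sketch of that check in Python, assuming each event has already been parsed into a dict. The helper names are hypothetical, and the top-level `wiki` field is only an example of an existing consumer filter (it is borrowed from the question earlier in this thread, not from the announcement).

```python
from typing import Any, Dict


def is_canary(event: Dict[str, Any]) -> bool:
    """True if the event's meta.domain field is set to "canary"."""
    meta = event.get("meta")
    return isinstance(meta, dict) and meta.get("domain") == "canary"


def is_interesting(event: Dict[str, Any], wiki: str) -> bool:
    """Discard canary events first, then apply the consumer's own filter.

    The `wiki` comparison stands in for whatever filtering a consumer
    already does; it is an assumption for illustration.
    """
    if is_canary(event):
        return False
    return event.get("wiki") == wiki


if __name__ == "__main__":
    sample = [
        {"meta": {"domain": "canary"}, "wiki": "enwiki"},
        {"meta": {"domain": "en.wikipedia.org"}, "wiki": "enwiki"},
        {"meta": {"domain": "de.wikipedia.org"}, "wiki": "dewiki"},
    ]
    kept = [e for e in sample if is_interesting(e, "enwiki")]
    print(len(kept))
```

Checking meta.domain before any other filter means the consumer does not depend on what values the schema's example event happens to contain.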
Back in 2020, we began producing canary events into all new streams, but we never got around to enabling them for streams that already existed. We needed to ensure that all consumers of these streams filtered out the canary events. We are only now getting around to enabling canary events for all streams.
We will enable canary event production https://phabricator.wikimedia.org/T266798 for the following streams on Monday, December 11th, 2023:
- mediawiki.recentchange
- mediawiki.page-create
- mediawiki.page-delete
- mediawiki.page-links-change
- mediawiki.page-move
- mediawiki.page-properties-change
- mediawiki.page-restrictions-change
- mediawiki.page-suppress
- mediawiki.page-undelete
- mediawiki.revision-create
- mediawiki.revision-visibility-change
- mediawiki.user-blocks-change
- mediawiki.centralnotice.campaign-change
- mediawiki.centralnotice.campaign-create
- mediawiki.centralnotice.campaign-delete
If you consume any of these streams, either external to WMF networks using EventStreams, or internally using Kafka, please ensure that your consumer logic discards events where meta.domain == "canary" before this date. (Note that not all of these streams are exposed publicly at stream.wikimedia.org https://stream.wikimedia.org/?doc#/streams.)
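For external consumers reading EventStreams as server-sent events, the discard step might look roughly like the sketch below. It assumes the consumer already has an iterable of decoded text lines from the HTTP response and that each event's JSON arrives on a single `data:` line; the function name is hypothetical, and a real consumer would more likely use an SSE client library than parse lines by hand.

```python
import json
from typing import Any, Dict, Iterable, Iterator


def iter_non_canary_events(sse_lines: Iterable[str]) -> Iterator[Dict[str, Any]]:
    """Yield parsed events from SSE text lines, skipping canary events.

    Assumes one JSON object per `data:` line. Lines that are not data
    lines (event:, id:, comments, blanks) and payloads that are not
    JSON objects are ignored.
    """
    for line in sse_lines:
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if not payload:
            continue
        try:
            event = json.loads(payload)
        except json.JSONDecodeError:
            continue
        if not isinstance(event, dict):
            continue
        meta = event.get("meta")
        if isinstance(meta, dict) and meta.get("domain") == "canary":
            continue  # artificial canary event: discard
        yield event


if __name__ == "__main__":
    lines = [
        "event: message",
        'data: {"meta": {"domain": "canary"}, "title": "Example"}',
        "",
        'data: {"meta": {"domain": "en.wikipedia.org"}, "title": "Real page"}',
    ]
    for ev in iter_non_canary_events(lines):
        print(ev["title"])
```

Internal Kafka consumers can apply the same meta.domain test to each deserialized message value.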
Thank you,
-Andrew Otto & the WMF Data Engineering team https://wikitech.wikimedia.org/wiki/Data_Engineering
References
- T266798 - Enable canary events for all MediaWiki streams https://phabricator.wikimedia.org/T266798
- T251609 - Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events https://phabricator.wikimedia.org/T251609
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/