> What is the content of these canary events?
Good question! I just updated EventStreams docs here with an answer:

The content of most canary event fields is copied directly from the first example event in the event's schema, e.g. the mediawiki/recentchange example or the mediawiki/revision/create example. These examples can also be seen in the OpenAPI docs for the streams, e.g. the mediawiki.page-move example value. The code that creates canary events can be found here (as of 2023-11).



On Thu, Nov 9, 2023 at 11:14 AM Siddharth VP <siddharthvp@gmail.com> wrote:
Hi Andrew,

What is the content of these canary events? Do they have a data section or is it just the metadata? If I already have filtering to process only interesting events (say data.wiki === 'enwiki'), do I still need to add additional filtering to discard canary events?

On Thu, 9 Nov 2023 at 21:23, Andrew Otto <otto@wikimedia.org> wrote:
tl;dr 

Ignore this email if you do not use MediaWiki event streams.


On Monday, December 11, 2023, all MediaWiki-related event streams will have artificial canary events injected into them. If you use any of these streams, you should discard these canary events.


Add code to your consumers that discards events where meta.domain == "canary".

Canary Events

At WMF, we use artificial 'canary' AKA 'heartbeat' events to differentiate between a broken event stream and an empty event stream.  Canary events should be produced at least once an hour.  If there are no events in a stream for an hour, then something is likely broken with that stream.
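
The one-hour rule above can be sketched as a small staleness check. This is only an illustration of the idea, not WMF's monitoring code: the function name stream_looks_broken and the MAX_SILENCE constant are hypothetical, and the timestamps are made up.

```python
from datetime import datetime, timedelta, timezone

# Assumption: canary events arrive at least hourly, so more than one hour
# of silence suggests a broken stream rather than an empty one.
MAX_SILENCE = timedelta(hours=1)


def stream_looks_broken(last_event_time, now, max_silence=MAX_SILENCE):
    """Return True if no event (canary or real) was seen within max_silence."""
    return now - last_event_time > max_silence


# Made-up timestamps for illustration.
now = datetime(2023, 12, 11, 12, 0, tzinfo=timezone.utc)
recent = now - timedelta(minutes=20)
stale = now - timedelta(hours=2)

recent_is_broken = stream_looks_broken(recent, now)
stale_is_broken = stream_looks_broken(stale, now)
```

Note that the check has to look at all events, canaries included, so it belongs before any step that discards canary events.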


These artificial canary events can be identified by the fact that their meta.domain field is set to "canary".  If you use any of the streams listed below, you will need to add code that discards any events where meta.domain == "canary".
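
A minimal sketch of such a filter in Python, assuming each event has already been parsed into a dict. The helper names is_canary and discard_canaries are hypothetical, and the sample events are made up; only the meta.domain == "canary" check comes from this announcement.

```python
def is_canary(event):
    """Return True if the event's meta.domain field is "canary"."""
    meta = event.get("meta")
    return isinstance(meta, dict) and meta.get("domain") == "canary"


def discard_canaries(events):
    """Keep only the events that are not canary events."""
    return [event for event in events if not is_canary(event)]


# Made-up sample events; real events carry many more fields.
sample_events = [
    {"meta": {"domain": "canary"}, "wiki": "enwiki"},
    {"meta": {"domain": "en.wikipedia.org"}, "wiki": "enwiki"},
]

kept = discard_canaries(sample_events)
```

Checking meta.domain explicitly, rather than relying on an existing filter on some other field, avoids depending on whatever values the canary event happens to copy from the schema example.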


Back in 2020, we began producing canary events into all new streams, but we never got around to enabling them for streams that already existed, because we first needed to ensure that all consumers of these streams filtered out the canary events. We are only now getting around to enabling canary events for all streams.


We will enable canary event production for the following streams on Monday, December 11th, 2023:


    - mediawiki.recentchange

    - mediawiki.page-create

    - mediawiki.page-delete

    - mediawiki.page-links-change

    - mediawiki.page-move

    - mediawiki.page-properties-change

    - mediawiki.page-restrictions-change

    - mediawiki.page-suppress

    - mediawiki.page-undelete

    - mediawiki.revision-create

    - mediawiki.revision-visibility-change

    - mediawiki.user-blocks-change

    - mediawiki.centralnotice.campaign-change

    - mediawiki.centralnotice.campaign-create

    - mediawiki.centralnotice.campaign-delete


If you consume any of these streams, either external to WMF networks using EventStreams, or internally using Kafka, please ensure that your consumer logic discards events where meta.domain == "canary" before this date. (Note that not all of these streams are exposed publicly at stream.wikimedia.org.)


Thank you,

-Andrew Otto & the WMF Data Engineering team


References

- T266798 - Enable canary events for all MediaWiki streams

- T251609 - Automate ingestion and refinement into Hive of event data from Kafka using stream configs and canary/heartbeat events


_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-leave@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/