Hi all,
tl;dr On Monday, August 6, we are making EventStreams multi-DC (multi-datacenter), and this should be transparent to users.
Due to a recent outage of our main eqiad Kafka cluster (https://wikitech.wikimedia.org/wiki/Incident_documentation/20180711-kafka-eqiad), we want the EventStreams service to support multiple datacenters for higher availability. To do so, we need to hide Kafka message offsets from SSE/EventSource clients, because offsets are logical positions that are specific to a single Kafka cluster. On Monday, August 6, we will deploy a change to EventStreams that makes it use message timestamps instead of message offsets in the SSE/EventSource id field that is returned with every message. This will allow EventStreams to be backed by any Kafka cluster, with auto-resume on reconnect based on timestamps instead of cluster-specific offsets.
This deployment should be transparent to clients. SSE/EventSource clients will reconnect automatically and begin to send timestamps instead of offsets in the Last-Event-ID header.
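For anyone who consumes the stream without a browser EventSource implementation, here is a minimal sketch of the resume mechanism described above: remember the most recent id field, and send it back as Last-Event-ID when reconnecting. The helper names and the sample id value are hypothetical, made up for illustration; the real id format is whatever EventStreams returns, and clients should treat it as opaque. The parsing is also simplified (it does not wait for the blank line that ends an event before recording the id).

```python
import re


def last_event_id(sse_text, previous_id=None):
    """Return the most recent SSE `id` value seen in a chunk of stream text.

    Hypothetical helper. The id is treated as an opaque string, so it does
    not matter whether it carries offsets or timestamps.
    """
    current = previous_id
    # SSE lines may end with CRLF, LF, or CR.
    for line in re.split(r"\r\n|\n|\r", sse_text):
        if line.startswith("id:"):
            value = line[3:]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            current = value
    return current


def reconnect_headers(last_id):
    """Build the request headers for a (re)connect to the stream."""
    headers = {"Accept": "text/event-stream"}
    if last_id is not None:
        # Echo back exactly what the server last sent; do not parse it.
        headers["Last-Event-ID"] = last_id
    return headers


# Made-up sample event; the id value here is an assumption, not the real format.
sample = (
    "event: message\n"
    'id: [{"topic": "example.topic", "partition": 0, "timestamp": 1533556800000}]\n'
    'data: {"title": "Example"}\n'
    "\n"
)

resume_id = last_event_id(sample)
headers = reconnect_headers(resume_id)
```

Because the client only echoes the id back, a client written this way should not need changes when the id contents switch from offsets to timestamps.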
You can read more about this work here: https://phabricator.wikimedia.org/T199433
- Andrew Otto, Systems Engineer, WMF