Hello,

There will be a couple of brief interruptions to some of the Data Platform services this Wednesday and Thursday, as we are supporting SRE Infrastructure Foundations with their work to upgrade the network switches in T348977.

Specifically, we need to perform a role swap of our two Analytics_Meta database servers, which serve Hive, Druid, DataHub and Superset. The roles will be swapped on Wednesday at around 10:00 UTC and swapped back on Thursday at around 10:00 UTC. On each occasion, there will be a brief period where the databases are made read-only, while the replication roles are swapped and the application configuration is updated. This may result in errors if you are actively using any of the applications at the time.

To minimize the chance of data processing errors, I will also be pausing ingestion to the data lake around 1 hour before each role swap, so that data pipelines do not try to write to Hive or ingest to Druid. You may therefore also notice a delay in data arriving in HDFS, Hive, and Druid, but this shouldn't be more than an hour or so.

If you have any queries or concerns, please don't hesitate to let us know by replying to this email, or on #data-engineering-collab on Slack, or #wikimedia-analytics on IRC.

Kind regards,
Ben

--
Ben Tullis (he/him)
Senior Site Reliability Engineer
Wikimedia Foundation