Hello,
There will be a couple of brief interruptions to some of the Data Platform services this Wednesday and Thursday, as we are supporting SRE Infrastructure Foundations with some of their work to upgrade the network switches in T348977.
Specifically, we need to perform a role swap of our two Analytics_Meta
database servers, which serve Hive, Druid, DataHub and Superset.
The roles will be swapped on Wednesday at around 10:00 UTC and
swapped back on Thursday at around 10:00 UTC. On each occasion,
there will be a brief period where the databases are made
read-only, while the replication roles are swapped and the
application configuration is updated. This may result in errors if
you are actively using any of the applications at the time.
In order to minimize the chance of data processing errors, I will
also be pausing ingestion to the data lake around 1 hour before
each role swap, so that data pipelines do not try to write to Hive
or ingest to Druid. Therefore you may also notice a delay for data
to arrive in HDFS, Hive, and Druid, but this shouldn't be more
than an hour or so.
If you have any queries or concerns, please don't hesitate to let us know by replying to this email, or on #data-engineering-collab on Slack, or #wikimedia-analytics on IRC.
Kind regards,
Ben
Ben Tullis (he/him)
Senior Site Reliability Engineer
Wikimedia Foundation