Hello Cloud Admins!
As part of https://phabricator.wikimedia.org/T174569 we have to alter some big tables. One of them is logging, where the alter takes around 8h on wikidata, for instance, which is in the shard I am currently working on.
Because of the nature of the change (some columns being added) and ROW-based replication (which is what we use on the sanitarium hosts), this change needs to go through replication (from the sanitarium hosts, or their masters, to the labs servers).
This will generate lag, but if it is not done that way, replication will break until the column is added on the labs hosts, which is less desirable than replication lag.
I am planning to run the alter on the sanitarium host in s5, probably tomorrow or Monday (I will notify when I start it). That means there will be lag on the labs servers, for a few hours, on the s5 instance. This will also affect s1 and s3, because we are using the same replication thread for those shards too (which is a FIXME we have pending).
s2, s4, s6 and s7 will remain unaffected as they have their own replication thread.
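For anyone who wants to check which shards will lag, here is a rough sketch of the shard-to-replication-thread layout described above. The thread names and the lagging_shards helper are made up for illustration, and it assumes s2, s4, s6 and s7 each have a separate thread; it is not taken from our actual configuration.

```python
# Hypothetical shard -> replication thread mapping, based only on this email.
# Assumption: s1, s3 and s5 share one thread; s2, s4, s6 and s7 are assumed
# to each have their own thread (the names below are illustrative only).
REPLICATION_THREAD = {
    "s1": "shared-s1-s3-s5",
    "s3": "shared-s1-s3-s5",
    "s5": "shared-s1-s3-s5",
    "s2": "s2",
    "s4": "s4",
    "s6": "s6",
    "s7": "s7",
}


def lagging_shards(altered_shard):
    """Return the shards whose labs replicas lag while altered_shard is altered."""
    thread = REPLICATION_THREAD[altered_shard]
    # Every shard on the same replication thread waits behind the alter.
    return sorted(s for s, t in REPLICATION_THREAD.items() if t == thread)


print(lagging_shards("s5"))
```

With this layout, an alter on s5 holds back s1 and s3 as well, while an alter on any of the other shards would only delay that shard.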
Should you have any questions, let me know!
Thanks,
Manuel