Hi,
between 5:30 and 7:10 UTC on 2023/09/27 the WDQS servers running in the eqiad datacenter have returned results with more than 10 minutes of lag. This caused bots using MW maxlag [0] to stop functioning properly during that time.
The incident started after a failure of the mirroring system between two of our kafka clusters [1], such incident should not have impacted WDQS but it uncovered improper sandboxing of the WDQS updater test setup [2].
Sorry for the inconvenience.
--
David Causse
Software Engineer, Wikimedia Foundation