Thanks to a last-minute intervention by Jaime Crespo, toolsdb is back to working as normal. Some context can be found at https://phabricator.wikimedia.org/T236384

-Andrew + wmcs team


On 10/24/19 10:23 AM, Andrew Bogott wrote:
An entirely surprising side-effect of this maintenance is causing chronic database instability. We're working to resolve this, but in the meantime the tools database server is likely to go down and come back up several times. We'll update once things are stable again.

Sorry for the (ongoing) interruption!

-Andrew + wmcs team



On 10/21/19 2:49 PM, Brooke Storm wrote:
A redundant power supply upgrade taking place in the datacenter this week could affect the VM that Toolsdb runs on. To protect data in case anything goes wrong, we anticipate a brief outage of the mysql service on Thursday 10/24 at 11am UTC. Tools may need to be restarted afterwards to reconnect to the database. We do not anticipate any worse disruption, but if there is disruption beyond what is planned, a failover may be necessary. A failover will not include the non-replicated tables mentioned here: https://wikitech.wikimedia.org/wiki/Help:Toolforge/Database#ToolsDB_Backups_and_Replication

The VM resides on the cloudvirt1019 hypervisor, which is why it is in scope for the maintenance that requires this notice and action. That maintenance is detailed here: https://phabricator.wikimedia.org/T227540

We sincerely apologize for the short notice.

Brooke Storm
Senior SRE
Wikimedia Cloud Services
IRC: bstorm_


_______________________________________________
Wikimedia Cloud Services announce mailing list
Cloud-announce@lists.wikimedia.org (formerly labs-announce@lists.wikimedia.org)
https://lists.wikimedia.org/mailman/listinfo/cloud-announce