TL;DR:
- c1.labsdb (labsdb1001.eqiad.wmnet) is down due to hardware issues
- *.labsdb are pointing to c3.labsdb (labsdb1003.eqiad.wmnet)
The physical server behind c1.labsdb (labsdb1001.eqiad.wmnet) experienced a hard drive failure around 2017-11-01T03:30 UTC. This failure is preventing the MySQL service on that host from starting. The *.labsdb service names that were pointed at that server have been updated to point to c3.labsdb (labsdb1003.eqiad.wmnet) instead.
See https://phabricator.wikimedia.org/T179464 for more information and additional updates.
Expect slower-than-normal performance, as all traffic is now handled by a single server. Now would be a great time to update the configuration for your tools to use the new database cluster [0][1].
[0]: https://phabricator.wikimedia.org/phame/post/view/70/new_wiki_replica_server...
[1]: https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_shutdown
Bryan
On Wed, Nov 1, 2017 at 9:42 AM, Bryan Davis bd808@wikimedia.org wrote:
TL;DR:
- c1.labsdb (labsdb1001.eqiad.wmnet) is down due to hardware issues
- *.labsdb are pointing to c3.labsdb (labsdb1003.eqiad.wmnet)
Manuel (one of our awesome DBAs) was able to get the MySQL server for c1.labsdb back up and running in read-only mode.
We do not know how much longer the failing SSD will survive, but until it fails again, users who have user-created databases on this host can attempt to connect and archive them. If the data you have stored there is something that you can recreate, you should probably do that instead.
Please move non-reproducible data to tools.db.svc.eqiad.wmflabs. Because c3.labsdb (labsdb1003.eqiad.wmnet) will be shut down on Wednesday 13 December 2017, moving any data to c3 will gain you less than 6 weeks. Tools that need to write data to a database should also be updated to use tools.db.svc.eqiad.wmflabs. See https://wikitech.wikimedia.org/wiki/Wiki_Replica_c1_and_c3_shutdown for more details.
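For anyone who wants to script the archive-and-move step, here is a rough Python sketch that only builds the command lines for a dump from c1.labsdb and a load into tools.db.svc.eqiad.wmflabs. It is not an official procedure. The database name s12345__mydb, the helper names, and the ~/replica.my.cnf credentials path are assumptions for illustration, and the target database is assumed to already exist on the new host.

```python
import os
import shlex

OLD_HOST = "c1.labsdb"                    # failing host, currently read-only
NEW_HOST = "tools.db.svc.eqiad.wmflabs"   # destination named in this announcement
DEFAULTS_FILE = "~/replica.my.cnf"        # assumption: where your credentials live


def dump_command(database, host=OLD_HOST, defaults_file=DEFAULTS_FILE):
    """Build the argv for dumping one user-created database with mysqldump."""
    return [
        "mysqldump",
        # --defaults-file has to be the first option on the command line.
        "--defaults-file=" + os.path.expanduser(defaults_file),
        "--host=" + host,
        # Consistent snapshot for InnoDB tables without taking table locks.
        "--single-transaction",
        database,
    ]


def load_command(database, host=NEW_HOST, defaults_file=DEFAULTS_FILE):
    """Build the argv for loading a dump into an existing database."""
    return [
        "mysql",
        "--defaults-file=" + os.path.expanduser(defaults_file),
        "--host=" + host,
        database,
    ]


def pipeline(database):
    """Render the dump-and-load step as a single shell pipeline string."""
    return shlex.join(dump_command(database)) + " | " + shlex.join(load_command(database))


if __name__ == "__main__":
    # Hypothetical database name; replace with your own before running anything.
    print(pipeline("s12345__mydb"))
```

The script prints the pipeline instead of running it, so you can review the command first. Saving the dump to a file before loading it is likely the safer choice given that the drive may fail again at any time.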
Due to the failure of labsdb1001.eqiad.wmnet, the planned reboot of labsdb1003.eqiad.wmnet on Tuesday 07 November 2017 has been cancelled.
Bryan (on behalf of the Cloud Services and DBA teams)
cloud-announce@lists.wikimedia.org