We have completed all of the updates and reboots for the hypervisors and instances in https://phabricator.wikimedia.org/T184910, but more maintenance events, less invasive than these, are still to come. They are being tracked in https://phabricator.wikimedia.org/T184910.
Most of this will be handled gracefully without user impact, but not all of it.
Tomorrow (1/18/2018) we will reboot the `dumps` NFS server, which also provides the `maps` and `scratch` NFS shares. This is an outage event because the server is a single point of failure. Efforts to improve this are happening in https://phabricator.wikimedia.org/T168486.
We will send further announcements for any maintenance that has user impact.
As part of a security upgrade, I'll be rebooting the systems that host Wikitech and Horizon in about two hours, at 14:00 PST (16:00 CST).
Those websites will be briefly unavailable, as will the Nova API. The Nova API outage will cause a brief interruption to the WMF Continuous Integration system. The total downtime should not exceed 10 minutes.
Sorry for the interruption!
-Andrew
On 1/17/18 2:19 PM, Andrew Bogott wrote:
> As part of a security upgrade, I'll be rebooting the systems that host Wikitech and Horizon in about two hours, at 14:00 PST (16:00 CST).
These reboots are done and everything is back up. Sorry for any inconvenience caused!
-Andrew