The reboots are now done and everything is upgraded. So far things seem
back to normal, but visit us in #wikimedia-cloud if you find things amiss.
-Andrew (+ WMCS team)
On 1/16/18 8:57 AM, Andrew Bogott wrote:
The canary reboots last week went well, so we'll be upgrading and
rebooting the rest of the cloud over the course of the day today,
beginning in a few minutes.
As always, we'll do our best to minimize effects within Toolforge,
although it's always a good idea to make sure your jobs are still
running after windows like this. The VMs in last week's list
(attached below) are already good to go, so they should be unaffected.
On 1/11/18 3:15 PM, Andrew Bogott wrote:
Today's round of reboots is now finished -- the hosts rebooted are
One correction: Monday is a holiday, so we're planning to reboot the
rest of the fleet on Tuesday, January 16th. Any VMs not in the list
below should anticipate downtime at some point on Tuesday.
On 1/11/18 1:02 PM, Andrew Bogott wrote:
In a few minutes I'm going to start the first round of reboots.
We're going to do a subset of the cloud and then make sure there are
no bad effects before doing the remainder on Monday.
The following VMs will be upgraded and rebooted over the next few
On 1/4/18 9:28 AM, Andrew Bogott wrote:
Sometime soon (probably in the next day or two) we will be applying
kernel patches to all VMs and physical hosts in WMCS. This is to
address an urgent security issue, so we'll be skipping the
traditional 7-day warning period -- basically as soon as proper
fixes are available we'll start patching and rebooting.
As usual, we'll do our best to re-balance Toolforge grid nodes, so
impact on Toolforge users should be minimal (worst case you may
need to manually restart interrupted tasks).
For other users: if your VPS project requires special handling or
specific notice about when a particular VM will reboot, please add
a subtask describing your need to