On 10/04/13 00:34, Antoine Musso wrote:
On Friday, April 12th at 13:00 GMT, the Wikimedia operations team is going to add two SSD disks to the gallium server. The downtime is expected to last up to two hours.
Plan of action:
- the server is brought down, the disks are added, and the server is restarted (est: 30 min)
- Zuul/Jenkins restart (est: up to 45 min)
The internal reference for this operation is RT #4916.
Side effect:
gallium hosts both Zuul and Jenkins. The two services will thus be unable to fulfil their duties, such as running lint checks and unit tests on submitted changes. If you really need a change to be merged, you will have to test it locally and submit it manually.
Any changes submitted during the maintenance window will need to be retriggered. The easiest way is to either rebase them or make a tiny edit to the commit message. That will produce a new patchset, which will in turn trigger the tests as usual.
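
As an illustration of the two retrigger options above, here is a minimal sketch in Python that lists the git commands involved and, optionally, runs them. It is not part of the original announcement. It assumes the git-review tool is installed, that the remote is named "origin" and the target branch is "master", and the helper names retrigger_commands and run are hypothetical. When amending, keep the Change-Id line in the commit message so the upload lands on the same change.

```python
import subprocess


def retrigger_commands(method="rebase", remote="origin", branch="master"):
    """Return the git commands that should yield a new patchset.

    The remote and branch defaults are assumptions; adjust them to
    match your repository.
    """
    if method == "rebase":
        return [
            ["git", "fetch", remote],
            ["git", "rebase", f"{remote}/{branch}"],
            ["git", "review"],  # upload the rebased commit
        ]
    if method == "amend":
        return [
            # Opens the editor: make a tiny edit, keep the Change-Id line.
            ["git", "commit", "--amend"],
            ["git", "review"],  # upload the amended commit
        ]
    raise ValueError(f"unknown method: {method!r}")


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Dry run: only shows what would be executed.
    run(retrigger_commands("rebase"))
```

If the local commit is already based on the tip of the target branch, the rebase changes nothing, so the commit-message edit is the option to use in that case.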
Note: We chose Friday because of ops availability in the datacenter and the lack of software deployments on that day. It is also my late day, so I will be around for a few hours after the maintenance.
Jenkins is back up and operational. It has been configured to point the job workspaces at the SSD drive, which should speed up the builds a bit.
For reference the workspace root is /srv/ssd/jenkins .
Thank you, Chris Johnson, for making this possible in a timely manner!