[Labs-announce] Disruptive Tools NFS maintenance on 11/2/2016

Madhumitha Viswanathan mviswanathan at wikimedia.org
Fri Oct 21 18:00:13 UTC 2016


As the next step in our storage redundancy and reliability efforts for
Labs, we have a significant migration coming up on 11/2 starting 08:00
PDT (15:00 UTC) involving the tools NFS share. The maintenance window can be
up to 48h long, and will affect most running tools. At the end of the
migration, everything (except transient jobs) should ideally be working the
same way as it was before the migration, but better.

Here's what to expect during the maintenance window:

* The tools NFS share (/data/project and /home) will be read-only for the
duration of the maintenance, so no new data or logs will get written to it.
* New jobs cannot be submitted for the whole maintenance window - this
means submitting jobs through cron or tools-mail will not function,
although tools-mail can continue to send emails.
* Current jobs might keep running, but won't get rescheduled if they die.
If they do not die and aren't writing to NFS they should be fine.
* All exec nodes will be depooled, rebooted and repooled. Jobs that are not
rescheduled automatically will have died and will need manual restarts.
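
For tool maintainers who want a long-running job to cope with the read-only
share, here is a minimal sketch in Python. It assumes the job can simply skip
its writes while NFS is read-only; the helper name is_writable and the tool
directory in the usage note are hypothetical, not part of any Labs tooling.

```python
import tempfile


def is_writable(directory):
    """Return True if a temporary file can be created in `directory`.

    Hypothetical helper: probes the directory directly instead of
    relying on permission bits, which do not reflect a read-only mount.
    """
    try:
        # The temp file is removed automatically when the block exits.
        with tempfile.TemporaryFile(dir=directory):
            return True
    except OSError:
        # Covers a read-only file system, missing directory, or
        # insufficient permissions.
        return False
```

A job could call is_writable("/data/project/mytool") (mytool being a
placeholder tool name) before each write and hold or drop its output while the
call returns False.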

Do let us know if you have any questions or concerns on the lists or on
#wikimedia-labs.

-- 
Madhumitha Viswanathan
Operations Engineer, Wikimedia Labs