- "Whaddya say we try that again, huh?"
- "Yes, yes. Yes. Without the oops."
So, now that I have power and internet again, a reschedule for tomorrow
(Thursday) at the same time:
=== Planned outage ===
When: Thursday April 25 at 18:00 UTC
Duration: 1 hour
Impact:
* Jobs running on the grid engine will be stopped, and execution nodes
will be temporarily disabled;
* The login server will be restarted during the window, ending active
sessions;
* The web service will be unavailable during the maintenance window; and
* Running processes not scheduled through the grid engine will be killed.
Recovery plan:
In case of unplanned failure during the maintenance window,
configuration will be rolled back to the current version (that is, the
gluster-based project storage will remain in place) and a new
window will be planned after postmortem.
-- Marc
Hello again,
The maintenance concluded successfully within the designated window, and
the Tool Labs instances now use the new NFS server for shared filesystems.
This doubled as a hard test of the continuous bot start/restart system,
since the entire cluster was disabled for rolling periods during the
maintenance, and the filesystem on which the actual tools were running
was switched underneath them -- pretty much a worst-case scenario to
recover from.
The result is that all but one tool that had been started as a
continuous process restarted cleanly and automatically as the cluster
returned to function with the new filesystem (the one exception failed
to return for an unrelated reason).
Thank you all for your patience!
-- Marc