[Labs-l] Filesystem maintenance aborted
Marc A. Pelletier
marc at uberbox.org
Thu Jan 15 20:09:10 UTC 2015
Hello Labs,
The maintenance scheduled for today was started, then aborted after two
hours because only roughly 2% of the necessary copy had been done in that
interval - at that rate, the partial outage would have lasted well over
four days.
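For reference, the four-day figure follows from a simple linear
extrapolation of the numbers above. The sketch below is illustrative only:
the function name is hypothetical, and it assumes the copy rate would have
stayed constant for the rest of the run, which is not guaranteed.

```python
def estimated_total_hours(elapsed_hours: float, fraction_done: float) -> float:
    """Extrapolate total run time, assuming a constant copy rate (an assumption)."""
    if not 0 < fraction_done <= 1:
        raise ValueError("fraction_done must be in (0, 1]")
    return elapsed_hours / fraction_done


# Figures from this message: roughly 2% copied after two hours.
total_hours = estimated_total_hours(elapsed_hours=2.0, fraction_done=0.02)
total_days = total_hours / 24

print(f"about {total_hours:.0f} hours, or {total_days:.1f} days")
```

Since "roughly 2%" is itself approximate, the result is only an order-of-
magnitude estimate.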
The unexpectedly poor performance arose because Labs storage does not
currently have enough free space to make a duplicate of the data over a
contiguous area of the disk array - resulting in performance much lower
than what was observed during testing.
We have a new storage shelf on order that should be put into production
fairly soon (weeks); rather than add the storage this provides
immediately, I'll be able to use it to make an offline copy of the Labs
storage /prior/ to the next attempt at switching the filesystems over to
the new scheme - which I will schedule some time in the future.
The existing filesystem behaved as expected and was properly readonly
during the two hours of partial outage, and has now been restored to
full read-write.
In the meantime, there should be no lasting effect from the partial
outage - in particular, the notes about existing open files becoming
stale are not applicable since the filesystem was not switched. No tool
or service that was not otherwise affected by the readonly filesystem
needs to be restarted.
-- Marc