Zwinger went down last night with some sort of disk problem. Although we've
removed most of the dependencies on Zwinger's NFS, a few had escaped notice, and
it took us a while to get things balanced right again.
* /home/wikipedia/htdocs was used via a redirect from /usr/local/apache/htdocs.
(It's not really used for anything; it just broke the Apache restart. This
should be moved to an empty local directory.)
* /home/wikipedia/conf/php*.ini was symlinked instead of copied around, so PHP
was misconfigured until we got the files copied back.
* Math renderings are on Amane, but mounted through /home/wikipedia/ someplace.
This made the directory inaccessible and complicated the unmount process.
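
Leftovers like the symlinked files above could probably be caught ahead of time
by scanning for symlinks that resolve into /home. A minimal sketch in Python,
not something we currently run: the function name nfs_dependent_links is made
up for illustration, and it only catches symlinks, not PATH entries or nested
mounts. It is best run while the NFS server is healthy, since resolving a link
on a hung mount may itself block.

```python
import os


def nfs_dependent_links(prefix, roots):
    """Return (link_path, resolved_target) pairs for symlinks under `roots`
    whose resolved target lies inside `prefix` (e.g. the NFS-mounted /home)."""
    prefix = os.path.realpath(prefix)
    hits = []

    def check(path):
        # Only symlinks are of interest; regular files are local by definition.
        if not os.path.islink(path):
            return False
        target = os.path.realpath(path)
        if target == prefix or target.startswith(prefix + os.sep):
            hits.append((path, target))
            return True
        return False

    for root in roots:
        # If the root itself points into the prefix, report it and don't descend.
        if check(root):
            continue
        # os.walk does not follow symlinked subdirectories by default, but it
        # still lists them in dirnames, so they get checked here.
        for dirpath, dirnames, filenames in os.walk(root):
            for name in dirnames + filenames:
                check(os.path.join(dirpath, name))
    return hits
```

Something like nfs_dependent_links("/home/wikipedia", ["/usr/local/apache"])
would then list each offending link next to the place it really points.
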

There were some other annoyances, such as having /home/wikipedia/bin in the PATH.
In general, having the NFS go down is a big pain in the ass to recover from; with
a lot of things hanging on it, it's virtually impossible to unmount and remount
from another server short of rebooting.
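
One way to notice a dead mount without getting stuck on it is to poke the path
from a child process and give up after a timeout. A rough sketch, assuming
Python is available on the box; path_responds and the five-second default are
illustrative, not an existing tool of ours.

```python
import subprocess
import sys


def path_responds(path, timeout=5.0):
    """Return True if `path` can be stat'ed within `timeout` seconds.

    The stat runs in a child process so that a hung NFS mount blocks the
    child rather than the caller.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", "import os, sys; os.stat(sys.argv[1])", path],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Best effort only: a process stuck in NFS I/O may ignore the kill,
        # so we deliberately do not wait for it again.
        proc.kill()
        return False
    return proc.returncode == 0
```

A False result covers both "path is missing" and "path did not answer in
time"; either way the caller knows not to touch it.
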

After rebooting things to clear up the broken mounts, we had to hassle around
fixing the above glitches and some cache inconsistencies, so the databases
were locked for a couple of extra hours while we fixed those.

The /home NFS server has been moved to suda; Zwinger is now back online doing
mail and DNS until we get those moved off.

A few weeks ago we tried removing the /home mount from Amane to protect it
against Zwinger failures. This has been working pretty well so far, so what I'd
like to see us do is get rid of the /home mounts entirely from most servers. We
don't _really_ need them, though they're convenient when working.
This will remove the temptation to be sloppy and use files off of NFS instead of
syncing them with the other source and config directories.

The main point where this could be difficult is with the debugging and
monitoring logs we write to from the wiki; we might store those on another
server, like the upload server, or just find better NFS settings that fail
gracefully and quickly, or something... Or we could switch to a log server of
some kind (e.g. syslog).
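
To give an idea of the syslog option: the sketch below sends log lines to a
remote syslog daemon over UDP instead of appending to a file on NFS. It is
written in Python purely for illustration (the wiki itself would need the
equivalent in its own code), and make_syslog_logger, the logger name and the
log host are all assumptions, not anything we have set up.

```python
import logging
import logging.handlers


def make_syslog_logger(name, host, port=514):
    """Return a logger that ships records to a syslog daemon at host:port.

    SysLogHandler uses UDP by default when given a (host, port) address, so a
    dead log server loses messages instead of hanging the sender.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=(host, port))
    handler.setFormatter(logging.Formatter("%(name)s: %(message)s"))
    logger.addHandler(handler)
    logger.propagate = False  # keep these lines out of any local root handlers
    return logger


# Example use (the host name is hypothetical):
#   log = make_syslog_logger("wiki-debug", "loghost.example")
#   log.info("slow query on page save")
```

The trade-off is that UDP delivery is not guaranteed, which is probably fine
for debugging logs and is the opposite failure mode from a hung NFS write.
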

-- brion vibber (brion @ pobox.com)