Zwinger was out of commission for about half an hour from 22:20 UTC. Not
entirely sure what happened, but syslog entries record a number of
out-of-memory kills of the web server.

Symptoms included very high load, very slow response on NFS, very *very*
slow response on interactive terminals, and timeouts of attempted ssh
logins. Machine seems ok after a reboot.

Ain't single points of failure grand? :)

Gwicke's been experimenting with the Coda distributed filesystem.
Hopefully it wasn't involved in the crash. :) If it works nicely, it
could be more reliable than our current center-heavy NFS system for
sharing the files to the web servers.

-- brion vibber (brion @ pobox.com)