[Labs-l] NFS outage [FIXED(?)]

Marc A. Pelletier marc at uberbox.org
Mon Jul 15 20:44:55 UTC 2013


So, 14 days after the previous reboot of labstore3, the same symptoms
occurred again today around 18h UTC.

This time, however, we found a likely culprit: a known bug in the RPC
scheduler[1], which paravoid confirmed by dumping a stack trace of the
live system before we rebooted it.  The patch was applied to the 3.8
kernel tree, so we upgraded labstore3 to linux-image-3.8.0-26-generic
before rebooting.

The NFS server is operational again at this time.

Provided this fixes the bug (which is almost certain given the stack
traces), we will wait two weeks and then restore the tunables to their
better-performing values; we had changed them while trying to isolate
the problem.

Thank you all for your patience,

-- Marc


