Tomasz Finc wrote:
All worker threads were stuck on the write() system call as NFS had
started to flap around the time of our outage.
Are the dumps run from a pool of worker servers that write to
storage via NFS?
It may be more efficient to save locally and transfer asynchronously to
the storage node.
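
A rough sketch of what I mean, in Python. This is only an illustration
under my own assumptions, not how the dump scripts actually work: the
function names, the directory layout, and the ".part" suffix are all
hypothetical. The worker writes to local disk, and a background thread
copies the finished file to the storage directory, so a stalled NFS
mount would block only the copy thread.

```python
import shutil
import threading
from pathlib import Path


def write_dump_locally(local_dir: Path, name: str, data: bytes) -> Path:
    """Write the dump to local disk so the worker does not block on NFS."""
    local_dir.mkdir(parents=True, exist_ok=True)
    path = local_dir / name
    path.write_bytes(data)
    return path


def transfer_async(local_path: Path, storage_dir: Path) -> threading.Thread:
    """Copy a finished dump to storage_dir (hypothetical NFS mount) in the background."""

    def _copy() -> None:
        storage_dir.mkdir(parents=True, exist_ok=True)
        # Copy under a temporary name, then rename, so readers of the
        # storage directory never see a partially written dump.
        tmp = storage_dir / (local_path.name + ".part")
        shutil.copyfile(local_path, tmp)
        tmp.replace(storage_dir / local_path.name)

    thread = threading.Thread(target=_copy, daemon=True)
    thread.start()
    return thread
```

The caller would keep the returned thread and join() it before deleting
the local copy, since a daemon thread is dropped if the process exits
first.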
It turns out that NFS is likely not the root cause of the issue. We've
been debugging it in bug #23264 as we make progress.
-tomasz