Brion Vibber wrote:
The image fileserver is currently a potential problem, as the application servers use NFS to manipulate files on it. NFS is notoriously temperamental, and if the server goes down it tends to hang for long periods of time, with similarly problematic results.
Well, I'll stop by and ask the fellow in charge of this for the Linux kernel, possibly tomorrow, or maybe 10am donuts on Weds.
Improvements to this could include minimizing our contact with the file server (avoid unnecessary reads and checks for file existence; we've got a damn database) and potentially using some more explicit file upload protocol which can fail gracefully.
Both sound like good ideas.
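
For the first idea, here is a rough sketch of what "ask the database, not the file server" might look like. It is only an illustration: the table and column names (image, img_name, img_size) and the helper functions are made up for the example, and it uses an in-memory SQLite database so it runs on its own. The point is just that an existence check becomes a query that never touches the NFS mount.

```python
import sqlite3


def open_catalog():
    """Create an in-memory catalog of uploaded images (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE image ("
        "  img_name TEXT PRIMARY KEY,"
        "  img_size INTEGER NOT NULL"
        ")"
    )
    return conn


def record_upload(conn, name, size):
    """Record a completed upload so later checks can skip the file server."""
    conn.execute(
        "INSERT OR REPLACE INTO image (img_name, img_size) VALUES (?, ?)",
        (name, size),
    )
    conn.commit()


def image_exists(conn, name):
    """Answer 'does this image exist?' from the database alone."""
    row = conn.execute(
        "SELECT 1 FROM image WHERE img_name = ?", (name,)
    ).fetchone()
    return row is not None
```

This only works if every upload and deletion goes through code that keeps the table in sync with what is actually on disk, so treat it as a sketch of the approach rather than a drop-in fix.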