[Labs-l] Status of the filesystem

Marc A. Pelletier marc at uberbox.org
Tue Oct 15 13:45:15 UTC 2013


So, a status update:

Labs NFS is now on new hardware, and functions properly without the disk
controller stalls that plagued the previous server.  (yay!)

On the down side, because we were not completely sure whether the
problem was caused by the controller being faulty or a regression in the
driver, the new install was (purposefully) very paranoid and downgraded
the kernel to 3.2, removing a few features as a side effect and causing
one unanticipated problem: changing file ownership no longer works
properly, even for root[1], meaning that any new tool account requires
manual intervention and "take" no longer works.

That problem can be fixed in two ways: either we upgrade the kernel back
to a version that has proper support for our setup, or we change the way
service groups are set up, which we have been intending to do for a
while.

The former is a low-impact change that does not require rebooting any
instances, and is probably going to be the first thing tried.  With a
bit of luck on our side, that'll fix the issue with no disruption.

The change in service group setup is on our roadmap /anyways/ since that
will fix a number of (mostly invisible to labs) limitations and problems
in our infrastructure; but if we are able to, we will wait until we move
labs to our primary data center to do it, so as to minimize disruption.

More news to come,

-- Marc

[1] For the curious, because usernames and user IDs do not match between
projects, we have to use UID-based security with NFS4 rather than the
default principal-based one, something that kernel versions before 3.5
only partially supported.  It works, but fails to recognize UID 0 as
superuser.
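
For anyone who wants to check where a given box stands relative to that
3.5 threshold, here is a minimal sketch in Python. The function names are
made up for illustration, and it only compares the major.minor part of a
release string; it says nothing about distribution backports, so treat
the result as a hint rather than a guarantee.

```python
import re

# Threshold taken from the footnote: kernels before 3.5 only partially
# support UID-based security with NFS4.
MIN_KERNEL = (3, 5)


def parse_kernel_release(release):
    """Return (major, minor) from a string such as '3.2.0-54-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognized kernel release: %r" % (release,))
    return int(match.group(1)), int(match.group(2))


def has_full_uid_support(release):
    """True if the release is at or above the 3.5 threshold (hypothetical helper)."""
    return parse_kernel_release(release) >= MIN_KERNEL
```

On a Linux host, the release string could come from platform.release()
in the standard library, e.g. has_full_uid_support(platform.release()).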



