[Labs-l] NFS issues, please deploy fire

Ryan Lane rlane at wikimedia.org
Mon Sep 17 17:18:29 UTC 2012


> Aside from the known issues with MySQL I've not seen any issues with
> gluster on project storage.
>

Good.

> Aside from the nfs instance being rather hammered on cpu/sluggish (which
> will probably only get worse as more people use it), if it dies then
> logins are broken for everyone.
>
> I wouldn't be surprised if there is still 'production' stuff running
> from there rather than project storage, combined with work-in-progress
> stuff hovering around people's home dirs it's potentially a mess and a
> chunk of lost data/time/effort.
>
> Yes people should take backups, use version control etc, but I think we
> should at least attempt to provide redundancy where possible (gluster)
> and minimize impact for end users when things break (which is going to
> happen at some point, especially when people are involved).
>

People shouldn't use home directories for anything other than
environment configuration. Home directories are personal by definition,
and the idea is that we're supposed to be working collaboratively. Put
data, scripts, logs, etc. in project storage. If everyone were doing
this right now, the NFS instance wouldn't be a massive issue.

> Things breaking/disappearing/being slow just makes people go elsewhere,
> which breaks the community up and drives everything in the opposite
> direction to where it should be heading.
>

Indeed. Of course, the NFS instance hasn't actually gone down in a
couple of months, but it has been slow. It needs to be switched out;
the work for that simply needs to be done.

Time is the issue. The majority of the Labs team is currently working
on production issues: Faidon is working on Swift, I'm working on the
new deployment system, Andrew is working on bringing OpenStack plugins
into our deployment. We'll need to set aside some time to take care of
this, but production generally gets priority.

Here's what we need to do to switch the home directories:

1. Enable pam_mkhomedir or autodir
2. Move the current data
3. Create a global read-only gluster share for the authorized_keys files
4. Update LDAP for autofs
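
As a rough sketch of steps 2 and 3, something like the following could
generate the copy commands for review before anything is actually moved.
The mount points (/mnt/gluster-home, /mnt/gluster-keys) and the function
names are placeholders made up for illustration, not our real layout, and
the script only builds a plan; it doesn't run anything.

```python
import posixpath
from typing import Dict, Iterable, List, Tuple


def plan_home_moves(users: Iterable[str], old_root: str, new_root: str,
                    dry_run: bool = True) -> List[List[str]]:
    """Build one rsync command (as an argv list) per user home directory.

    old_root and new_root are assumed mount points, e.g. the current NFS
    /home and a hypothetical gluster-backed replacement.
    """
    commands = []
    for user in sorted(users):
        # Trailing slashes make rsync copy the directory contents.
        src = posixpath.join(old_root, user) + "/"
        dst = posixpath.join(new_root, user) + "/"
        cmd = ["rsync", "-a"]
        if dry_run:
            cmd.append("--dry-run")
        commands.append(cmd + [src, dst])
    return commands


def plan_key_copies(users: Iterable[str], home_root: str,
                    keys_root: str) -> Dict[str, Tuple[str, str]]:
    """Map each user to (source, destination) for their authorized_keys.

    keys_root stands in for the global read-only share; the per-user
    directory layout under it is an assumption.
    """
    return {
        user: (
            posixpath.join(home_root, user, ".ssh", "authorized_keys"),
            posixpath.join(keys_root, user, "authorized_keys"),
        )
        for user in users
    }


if __name__ == "__main__":
    # Hypothetical user names and paths, for illustration only.
    for cmd in plan_home_moves(["alice", "bob"], "/home", "/mnt/gluster-home"):
        print(" ".join(cmd))
    for user, (src, dst) in plan_key_copies(
            ["alice", "bob"], "/home", "/mnt/gluster-keys").items():
        print(f"{user}: {src} -> {dst}")
```

Keeping --dry-run on by default means the plan can be looked over before
any data is touched; the real move would still need to be coordinated
with the LDAP/autofs change so nobody logs in halfway through.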

- Ryan


