[Labs-l] NFS issues

Petr Bena benapetr at gmail.com
Fri Jul 19 08:16:25 UTC 2013


/me slaps Coren for not using SAL, which is where I look first when I
see a rebooted server :P

On Thu, Jul 18, 2013 at 11:06 PM, Ken Snider <ksnider at wikimedia.org> wrote:
> FYI. Thanks for the update, Marc!
>
>
> --Ken.
>
> (Sent from iPhone)
>
> On 2013-07-18, at 1:13 PM, "Marc A. Pelletier" <marc at uberbox.org> wrote:
>
>> Some of you may have noticed some annoyance with the NFS filesystems
>> lately.  While we seem to have successfully solved the problem that
>> caused a complete crash every 14 days, there is a lingering issue with the
>> controller on the file server that causes intermittent stalls in the
>> disk IO.
>>
>> In practice, this should have no impact on your running tools (or
>> interactive sessions) except for disk access "freezing" for periods of
>> 2-3 minutes at irregular intervals.  The number of stalls seems to be
>> related to write traffic, but never gets much worse than 2-3 times per
>> hour (annoying though they be).
>>
>> In an attempt to solve the issue this afternoon, I tweaked some driver
>> settings on the file server but accidentally brought the filesystems
>> back up in the wrong order, making files appear unavailable for a
>> brief period (12s) and necessitating a reboot of the Tool Labs cluster.
>>
>> Sadly, this was in vain since the underlying issue remains.  It is not
>> yet clear if the issue is caused by the driver or a hardware problem,
>> but my efforts remain focused on solving the issue for good.
>>
>> In the meantime, I thank you for your patience as performance remains
>> impacted.
>>
>> -- Marc
>>
>> _______________________________________________
>> Labs-l mailing list
>> Labs-l at lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/labs-l


