> Date: Mon, 17 Dec 2007 00:19:28 +0000
> From: River Tarnell <river(a)wikimedia.org>
> Subject: Re: [Toolserver-l] downtime
> To: toolserver-l(a)lists.wikimedia.org
> Message-ID: <4765C090.1000808(a)wikimedia.org>
> Content-Type: text/plain; charset="iso-8859-1"
>
> the maintenance is finished now. the problem was caused by
> filesystem corruption on clematis:/aux0, the QFS filesystem
> where hemlock's /home is currently mounted from, when the
> connection to the iSCSI array was broken. i have replaced
> this filesystem with a VxFS filesystem, which should be more
> resilient against problems like this.
>
> the iSCSI problem was my fault, so sorry for that. to
> prevent it happening in the future, i've asked for the array
> to be connected directly to clematis's NIC, which should be
> more reliable. in the longer term the plan is to move
> hemlock's /home back to a local array; this should happen
> either this month or next, when a new array is installed at knams.
>
> a very small number of files were unrecoverable from the
> damaged /home.
> if any of your files are missing, mail ts-admins or file a
> bug and they can be restored from a backup.
>
> - river.
river - thanks for your efforts to get things back on the air. Much
appreciated.
Do you have a list of which files were unrecoverable?
I suppose we all, as good developer hygiene, should maintain a list of
everything we have (and an offsite backup) but .. :)
Larry Pieniazek
Hobby mail: Lar at Miltontrainworks dot com