Hello all,
Large parts of the toolserver cluster were down or very slow during the last few hours. As far as I can see, it was a problem with the user-store or with rosemary (to which the user-store is physically connected). I rebooted rosemary, but the reboot revealed problems with its IPv6 address. I tried to fix that, which caused several more reboots. Rosemary is now up and running, but the user-store is not available (it looks like Nosy just mounted it without updating the fstab file). So I was forced to remove the user-store everywhere (except on willow, because that needs a reboot, and a reboot is already scheduled for later today). I will see whether I can find the partition for the user-store and mount it, but I do not have much hope (there are way too many devices to try). Just to be clear: no data has been lost. Munin will also be unavailable, because its data is mounted on that host as well. I fear that we have to wait for Nosy to recover before we get the user-store back.
tl;dr: TS had problems, user-store is away.
Sincerely, DaB.
(anonymous) wrote:
Large parts of the toolserver cluster were down or very slow during the last few hours. As far as I can see, it was a problem with the user-store or with rosemary (to which the user-store is physically connected). I rebooted rosemary, but the reboot revealed problems with its IPv6 address. I tried to fix that, which caused several more reboots. Rosemary is now up and running, but the user-store is not available (it looks like Nosy just mounted it without updating the fstab file). So I was forced to remove the user-store everywhere (except on willow, because that needs a reboot, and a reboot is already scheduled for later today). I will see whether I can find the partition for the user-store and mount it, but I do not have much hope (there are way too many devices to try). Just to be clear: no data has been lost. Munin will also be unavailable, because its data is mounted on that host as well. I fear that we have to wait for Nosy to recover before we get the user-store back.
[...]
Couldn't the search be automated, à la (untested):
| mkdir t
| for DEVICE in part1 part2; do
|   mount -o ro $DEVICE t || continue
|   [ -e t/sge ] && echo "Found ./sge on $DEVICE"
|   umount t
| done
(with "sge" being just the first example of a directory beneath /mnt/user-store that I can remember).
Tim
Hello, At Monday 11 February 2013 20:20:32 DaB. wrote:
Couldn't the search be automated, à la (untested):
Several important partitions (for example, database partitions) live on this SAN. I have no idea what happens if a partition is mounted on two hosts at once, so I would like to avoid blind mounting.
Sincerely, DaB.
"DaB." WP@daniel.baur4.info wrote:
Couldn't the search be automated, à la (untested):
Several important partitions (for example, database partitions) live on this SAN. I have no idea what happens if a partition is mounted on two hosts at once, so I would like to avoid blind mounting.
On Linux and ext3/ext4, you should be pretty safe with "-o ro,noload", but given the risk compared to the possible gain, I think waiting for nosy's (family's) convalescence is more prudent :-).
Tim
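For reference, the search loop from earlier in the thread could be combined with the "-o ro,noload" suggestion roughly as follows. This is only a sketch: the function name probe_devices, the DRY_RUN and MNT variables, and the device paths are hypothetical and not taken from the toolserver setup. It defaults to a dry run that only prints the mount commands, because the concern about mounting a SAN partition on two hosts still applies, and "noload" is only meaningful on ext3/ext4.

```shell
#!/bin/sh
# Hypothetical helper: look for a marker directory on candidate devices.
# DRY_RUN defaults to 1, so nothing is mounted unless DRY_RUN=0 is set.
# MNT is the scratch mount point (assumed name, defaults to ./t).
probe_devices() {
    marker=$1
    shift
    mnt=${MNT:-./t}
    if [ "${DRY_RUN:-1}" != 1 ]; then
        mkdir -p "$mnt" || return 1
    fi
    for device in "$@"; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            # Only show what would be done.
            echo "would run: mount -o ro,noload $device $mnt"
            continue
        fi
        # Read-only, without replaying the journal (ext3/ext4 only).
        mount -o ro,noload "$device" "$mnt" || continue
        if [ -e "$mnt/$marker" ]; then
            echo "Found $marker on $device"
        fi
        umount "$mnt"
    done
}
```

Calling "probe_devices sge /dev/example1 /dev/example2" (made-up device names) prints one "would run" line per device; the mounts only happen when DRY_RUN=0 is set and the script runs as root.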
Hello all, At Tuesday 12 February 2013 15:55:34 DaB. wrote:
I fear that we have to wait for Nosy to recover before we get the user-store back.
I just got an email from Nosy saying that the user-store is back. I checked willow, and while it is a little slow (and very full), it seems to work.
Sincerely, DaB.
toolserver-l@lists.wikimedia.org