Hello all,
just a little story of what happened today: As you know, I planned to dump the
user databases of rosemary today to import them on thyme later. Around 12
o'clock CET I looked at the replag of thyme during a break and everything was
fine. After my dinner I checked my mail and found an email from the OSM guys
complaining that their title-dir had disappeared. As background: thyme carries
the NFS server for the user-store, title and munin. These are normally on
hemlock, but because hemlock's SAN card is broken we had to move them to
another server.
A short time later I spoke with Nosy on IRC about thyme. She told me that thyme
was inaccessible via SSH. A few days ago we had discovered that thyme's serial
console was not working (we have put that on the datacenter to-do list). But
without SSH and a serial console you can neither access a server nor reboot
it. Nosy had started to move the NFS server from thyme to rosemary, and we
completed that together.
Because of the missing user-store, the script that checks your quota at login
failed, and logging in to the Linux servers was hardly possible. I deleted the
script on these boxes and added a quick-and-dirty fix to puppet. This fix
failed later, making login on the Linux boxes impossible for some time (even
for roots).
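For what it is worth, a login-time check like this can be written so that a
missing or hanging user-store only skips the quota message instead of blocking
the login. The following is just a minimal sketch of that idea, not the script
we actually use: the names run_with_timeout and login_quota_message, the
check_quota callback and the 2-second timeout are all assumptions made for
illustration.

```python
import os
import threading


def run_with_timeout(fn, timeout):
    """Run fn() in a daemon thread.

    Returns (True, value) on success, or (False, reason) if fn raised
    or did not finish within `timeout` seconds.
    """
    box = []

    def worker():
        try:
            box.append((True, fn()))
        except Exception as exc:
            box.append((False, str(exc)))

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout)
    if not box:
        # The worker is still stuck (e.g. on a hung NFS mount); give up on it.
        return (False, "timed out")
    return box[0]


def login_quota_message(store_path, check_quota, timeout=2.0):
    """Return the quota message, or a 'skipped' note if the store is unusable.

    `check_quota` is a hypothetical callback that takes the store path and
    returns the text to show the user.
    """
    def task():
        os.stat(store_path)  # raises OSError if the path is missing
        return check_quota(store_path)

    ok, value = run_with_timeout(task, timeout)
    if ok:
        return value
    return "quota check skipped: " + value
```

The point of the design is that the login path never waits longer than the
timeout and never fails because of the check itself; the worst case is a
missing quota message.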
Switching the user-store from thyme to rosemary caused some problems on the
userland servers (because the user-store was busy), but I think we fixed them.
We may have to reboot some boxes in the next few days; I will send a mail if
needed.
Thyme also carried my wikidata replication program, which failed too (so the
replag of wikidata increased everywhere). I have now moved it to another
server. A strange thing is that the mysql process on thyme is still running;
even replication is working, so the replag will not increase there.
The next step is to reach Mark or someone from the datacenter to reboot thyme
and then find out what the problem was. Munin shows nothing abnormal.
Just to let you know. Good night.
Sincerely,
DaB.
--
Userpage: [[:w:de:User:DaB.]] — PGP: 2B255885