Hello, At Sunday 23 September 2012 21:01:27 DaB. wrote:
Hello,
At Sunday 23 September 2012 20:30:29 DaB. wrote:
For about an hour, the web servers have appeared to be unresponsive:
- http://ortelius.toolserver.org/~cvn/index.html
- http://wolfsbane.toolserver.org/~cvn/index.html
- https://toolserver.org/~cvn/index.html
All of them fail with no response and a timeout.
I can still SSH into wolfsbane and ortelius from willow, though.
I will investigate this now. So far, the only problem I have found is that hemlock is down.
I have now restored web access. As far as I can see, hemlock lost its external array and ran out of memory around 2:30 UTC. I have no idea why this affected our web servers. I rebooted hemlock to free the memory and restarted the web server on ortelius and wolfsbane; the web pages are back AFAIS. What is not working at the moment is the user-store and our backup, because both are on hemlock's external array. Munin, which is handled by hemlock, is also not working. I will try to fix all of this, but I guess I need nosy for that (and, in the worst case, Mark in the colo).
Sincerely, DaB.
Hello again, At Sunday 23 September 2012 23:47:18 DaB. wrote:
I have now restored web access. As far as I can see, hemlock lost its external array and ran out of memory around 2:30 UTC. I have no idea why this affected our web servers. I rebooted hemlock to free the memory and restarted the web server on ortelius and wolfsbane; the web pages are back AFAIS. What is not working at the moment is the user-store and our backup, because both are on hemlock's external array. Munin, which is handled by hemlock, is also not working. I will try to fix all of this, but I guess I need nosy for that (and, in the worst case, Mark in the colo).
Just an update: Nosy came online and fixed the partitions so that hemlock recognized them again. Then we rebooted hemlock to check whether everything was working again, and that failed. So we fixed it again (which needed another reboot) and manually fixed the things that were not working. In short: everything should work again now (as long as hemlock is not rebooted). Over the next day I will try to investigate why Zeus (our web server software) was not working without hemlock. We will also have a maintenance window later this week (I will give details in another mail) to fix hemlock properly.
Sincerely, DaB.
toolserver-announce@lists.wikimedia.org