At 14:03:09 UTC, our Amsterdam server cluster suffered a partial
power outage. One of the two power feeds in each rack went down for
approximately 6 seconds, causing servers that are not redundantly
connected to go down. The following statement was released by the data
center:
Amsterdam, 04 october 2008
Subject: Power disruption at SARA 04-10-08
On October 4th, approximately between 16:03 hrs and 18:00 hrs, a large
area of Amsterdam has suffered from a severe power outage. Unfortunately
also Science Park Amsterdam, where one of the SARA Datacenters is
located, was confronted with this failure.
The Emergency Power Supply took over the major part of the power
delivery, but unfortunately one of our four UPS systems malfunctioned.
As a result some customers experienced a short outage of approximately 6
seconds on one of their two powerfeeds. After the 6 seconds the
generators came online and provided full power on all feeds, including
the affected one. However, a small number of racks needed a manual reset
and was affected longer. The failure had consequences among others for
the internet traffic and some other SARA services. At this moment most
services are restored.
By now the external power situation is stable and normal again. SARA is
investigating the cause of the malfunction of the UPS system. If needed,
the generator will preventive be put online and running.
We are sorry for the inconvenience. If you need assistance you can
contact us for support by one of our onsite engineers.
Approximately half an hour after the start of the power loss incident,
we experienced an additional, unexplained problem: the servers in one of
the racks all shut down over the course of a few minutes. We think this
may have occurred due to rising temperatures in that rack, or the
systems being explicitly turned off by on-site personnel. At that point,
traffic was moved away from our Amsterdam cluster. Other racks were
unaffected.
The power supply is now stable again, and traffic has been moved back,
restoring the normal situation.
--
Mark Bergsma <mark(a)wikimedia.org>
System & Network Administrator, Wikimedia Foundation