[Toolserver-l] Power outage

Brett Hillebrand bretthillebrand at internode.on.net
Wed Jun 30 13:06:29 UTC 2010


This raises an interesting question: what made the server go down in the
first place? Surely the power to the server would be 1+1 (i.e. dual power
supplies attached to separate power circuits, powered by separate UPS and
generator grids respectively)?

This kind of redundancy is expected in data centers nowadays, and I assume
that all the TS servers are in a data center. I'm just curious as to why
this obviously isn't the case.

-Brett

-----Original Message-----
From: toolserver-l-bounces at lists.wikimedia.org
[mailto:toolserver-l-bounces at lists.wikimedia.org] On Behalf Of River Tarnell
Sent: Wednesday, 30 June 2010 9:24 AM
To: Wikimedia Toolserver Announcements
Cc: Wikimedia Toolserver Discussion
Subject: [Toolserver-l] Power outage

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

At about 22:30 UTC last night (Tuesday) one of our power circuits went
down for about 15 minutes.  This affected one node of the HA cluster
which was hosting the following services:

	Sun Grid Engine master server
	tsbot IRC bot
	DNS recursor
	MySQL server for sql-toolserver
	MySQL replication support infrastructure
	LDAP server

All services failed over to the other node and were online again within
22 seconds.  However, MySQL did not respond well to losing its
replication connection and had to be restarted manually, causing about
30 minutes replication lag.

	- river.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.15 (FreeBSD)

iEYEARECAAYFAkwqh5YACgkQIXd7fCuc5vKQTwCgm1RTRwqQcaSpg7M4IL7brHGi
ZF0An0EuetoI6PyQ9/1KYwBIAoym+Ta6
=omgy
-----END PGP SIGNATURE-----
