Hi,
from about 3:00Z to about 3:20Z, no login was possible to
nightshade and yarrow, (non-existent) passwords were
required for willow, and the webserver returned 404s.
MZMcBride had an open session into willow, and the load on
the accessible servers was within limits (cf.
http://p.defau.lt/?e_zsJIW_rAbfR3Cvlvx9Uw), but reverse
lookup of user names was broken (cf.
http://p.defau.lt/?asmBijtXnvzQacz1e8JXOQ) and ldapsearch
timed out as well (cf.
http://p.defau.lt/?P47PCC3_1d3mnoLyVqFUqQ). This looks
like a failure of the LDAP server.
Two other issues surfaced at that time:
- http://nagios.toolserver.org/ gave 500s during the outage.
I asked Coren to consult with WMF if there are possibilities
to outsource (or integrate :-)) this monitoring to
their existing infrastructure
(http://icinga.wikimedia.org/).
- The listed mail address for the Toolserver admins is
ts-admins(a)toolserver.org. While this may work during such
an outage (I didn't try) and personal mail addresses for
admins can be found in the toolserver-announce archives,
we should prefer an address routed externally. Trying
not to be too imaginative, I propose:
ts-admins(a)wikimedia.de.
Tim