Hi,
From about 3:00Z to about 3:20Z, no login was possible to nightshade and yarrow, willow asked for (non-existent) passwords, and the webserver returned 404s. MZMcBride had an open session on willow, and the load on the servers that were still accessible was within limits (cf. http://p.defau.lt/?e_zsJIW_rAbfR3Cvlvx9Uw), but reverse lookup of user names was broken (cf. http://p.defau.lt/?asmBijtXnvzQacz1e8JXOQ) and ldapsearch timed out as well (cf. http://p.defau.lt/?P47PCC3_1d3mnoLyVqFUqQ). This looks like a failure of the LDAP server.
Two other issues surfaced at that time:
- http://nagios.toolserver.org/ gave 500s during the outage. I asked Coren to check with WMF whether there are possibilities to outsource (or integrate :-)) this monitoring to their existing infrastructure (http://icinga.wikimedia.org/).
- The listed mail address for the Toolserver admins is ts-admins@toolserver.org. This may work during such an outage (I didn't try), and personal mail addresses for admins can be found in the toolserver-announce archives, but we should prefer an address that is routed externally. Trying not to be too imaginative, I propose ts-admins@wikimedia.de.
Tim