Hello all,
After our discussion about adding more roots, I got the impression that for
some of you the topic is quite urgent. To be honest, I do not feel comfortable
just adding a few roots at the moment.
So I thought of a compromise and created a new user group: Operators [1].
Operators have a limited set of elevated rights: enough to help the roots
and do everyday jobs, but not enough to have access to sensitive data
(so no approval from WMDE or WMF is necessary).
For testing I gave operator status to the following people: Merl, who already
manages SGE, Danny_B, who already manages the user-store, and Platonides, who
volunteered. There will be more in the future, but at the moment these 3 will
In the new group, the operators can gain experience while helping the users
and the TS, and the roots can see who might get root status someday and who
might not. The group is also a good place for users who would like to help the
TS but cannot invest the same amount of time as a root.
So let's see if this solution works.
[1] https://wiki.toolserver.org/view/Operators
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885
Hello all,
For another kernel update I have to reboot the Linux userland boxes again. The
reboot will happen
TODAY, 20:00 UTC.
The reboots will happen (again) sequentially at 15-minute intervals. SGE will
migrate or restart your jobs on other servers during the downtimes. You can
follow the progress at [1].
In other news from the Linux database servers (sql-s2 and sql-s5-user): I am
still trying to find the optimal configuration. For this I have to restart
MySQL every few hours to bring changes live. I try to keep the downtime there
to a minimum, but I guess slow and very outdated databases help no one.
[1] https://jira.toolserver.org/browse/MNT-1300
Userpage: [[:w:de:User:DaB.]] — PGP: 0x2d3ee2d42b255885
From about 3:00Z to about 3:20Z, no login was possible on
nightshade and yarrow, willow asked for (non-existent)
passwords, and the webserver returned 404s. MZMcBride had an
open session on willow, and the loads of the accessible
servers were within limits
(cf. http://p.defau.lt/?e_zsJIW_rAbfR3Cvlvx9Uw), but reverse
lookup of user names was broken
(cf. http://p.defau.lt/?asmBijtXnvzQacz1e8JXOQ) and
ldapsearch timed out as well
(cf. http://p.defau.lt/?P47PCC3_1d3mnoLyVqFUqQ). This looks
like a failure of the LDAP server.
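Since the lookups and ldapsearch hung rather than failing outright, any
check run during such an outage needs a hard timeout. Below is a minimal
sketch in Python of how the symptoms above could be probed. The function
name `run_check` and the example commands are my own assumptions for
illustration, not part of the Toolserver setup.

```python
import subprocess
import sys


def run_check(cmd, timeout_s):
    """Run cmd and classify the outcome as 'ok', 'failed' or 'timeout'."""
    try:
        result = subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # The command hung, as ldapsearch did during the outage.
        return "timeout"
    except OSError:
        # The command could not be started at all (e.g. not installed).
        return "failed"
    return "ok" if result.returncode == 0 else "failed"


if __name__ == "__main__":
    # Hypothetical checks; adjust the commands to the local setup.
    checks = {
        "reverse lookup of uid 0": ["getent", "passwd", "0"],
        "python starts": [sys.executable, "-c", "pass"],
    }
    for name, cmd in checks.items():
        print(f"{name}: {run_check(cmd, timeout_s=10)}")
```

Reporting "timeout" separately from "failed" keeps a hanging directory
lookup distinguishable from a command that simply returns an error.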
Two other issues surfaced at that time:
- http://nagios.toolserver.org/ gave 500s during the outage.
I asked Coren to check with WMF whether there are
possibilities to outsource (or integrate :-)) this monitoring
to their existing infrastructure.
- The listed mail address for the Toolserver admins is
ts-admins(a)toolserver.org. While this may work during such
an outage (I didn't try) and personal mail addresses for
admins can be found in the toolserver-announce archives,
we should prefer an address routed externally, and, trying
not to be too imaginative, I propose: