[Labs-l] open grid on bots

Marc A. Pelletier marc at uberbox.org
Mon Mar 11 15:03:53 UTC 2013


On 03/11/2013 05:14 AM, Petr Bena wrote:
> the lack of sysadmins was one of biggest problems of toolserver - just
> creation of account took ages and there was nearly no support at all -
> that's what I am afraid your project is heading to. It will be
> perfectly designed cluster maintained by 1 person.

I think you're missing the point that part of the reason why it's 
advantageous to move to WMF hosting is exactly the opposite; this means 
that ultimately, you get the weight of ops behind the infrastructure 
rather than just isolated sysadmins (volunteer or not).  Add to that 
that we get to leverage the technical resources already in place and we 
end up with an infrastructure that is much /less/ dependent on sysadmin 
intervention to run.

I'm a good sysadmin, which means I am a *lazy* sysadmin.  I can 
guarantee you that one of my primary objectives is that nobody needs to 
wait on me to do anything for normal tool writing and maintenance!  :-)  
What isn't currently automated will be configured to be self-serve -- as 
long as it does not impact reliability of other tools.  If I do my job 
right, all of my time will be spent sharing knowledge with maintainers 
and coping with hardware failures, not doing gruntwork.

That said, I don't believe there is anything wrong with volunteer 
sysadmins, and my understanding is that this is indeed something which 
we may look forward to in the future (although, admittedly, not this 
early in the Tool Labs life cycle).  But it's important that you also 
understand that objectives of reliability are best served by limiting 
the number of people who can be root, and to "formalize" a bit the way 
things are done.  Yes, this /does/ have the downside that some things 
are going to be a bit slower to do; but I want to be able to tell 
maintainers that "if your tool works now, it's not going to break 
tomorrow" and that means being a bit more disciplined and, yes, a bit 
more restrictive in how we do things.

In the meantime, part of the reason the WMF pays me is to make sure that 
there /is/ someone available to help.  Even when I'm not "on the clock", 
you'll find me easy to reach and responsive; when I'm not near IRC, I'm 
still reachable by email; and once the tools project is well on its way 
the other members of the ops team will also be able to react in case of 
emergencies.

-- Marc



