Tim Starling wrote:
My suggestion for dealing with this problem was to split the cluster, to
dedicate a small amount of hardware to the smaller projects, and then have
a more liberal access policy for that dedicated hardware. However, Jimbo
was strongly against it.
My mind could be changed about this, if there were consensus that it is a
good way to move forward. Certainly this rationale makes a lot of sense
to me, and it was a rationale that I never really understood before.
The one major downside I see to splitting the cluster comes if we have a
divergence of management practices across different clusters. It's
pretty efficient to have a single cluster managed by a single team. In
practice I'm hoping that as we add more datacenters they will be
accessible to developers in a nearly transparent way -- it doesn't make
sense to me that some people are responsible for managing one cluster
and some people for another, if it means a lot of duplication of effort
in things like writing scripts, etc.
There are obviously some upsides as well. If different projects are on
different hardware, then a crash doesn't take everything out
simultaneously. If different projects are on different hardware, we can
relax *a little bit* about who has shell access, since we could in
theory have people who have access only to their own specialty.
I can't be certain, but I can be almost certain, that we are in
uncharted territory here. I don't think there is any other web serving
organization pushing this many web pages out of this much hardware with
volunteers self-organizing in the crazy wiki way that we do things.
So we're pretty much on our own in figuring out the social and technical
ways to keep scaling... we can't really look to any other organization
for the answers.
--Jimbo