Jimmy Wales wrote:
They are flexible on the amounts -- it is essentially whatever we need
and can honestly justify. They are tech savvy and fully agree with my
view that just throwing random money and servers at us is not the best
use of our resources -- rather, they prefer that we ask for what we can
really use in a way that optimizes the use of their money -- best
strategy for everyone, obviously.
SOOOOOOOOO......
Feedback?
--Jimbo
From the "big picture" viewpoint, this occurs to me:
From watching the dev wiki and developer log, I can see that a huge
amount of admin effort is devoted to change management: keeping
different installs in sync with one another, deploying new software,
load-balancing, and cutting over from failed servers to new servers.
As the number of server farms worldwide gets larger, with different
configurations in different places, the amount of sysadminning is
likely to grow with the number of possible combinations of software
and hardware, which multiplies with each new variant, rather than
linearly with the number of systems to be administered. Simply scaling
up the sysadmin team to cope will be difficult, so automation is the
only way forward in the medium and long term.
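
To put rough numbers on that, here is a minimal Python sketch. The
counts are invented for illustration and are not taken from the actual
installs, and config_combinations is a hypothetical helper:

```python
from math import prod

def config_combinations(variants_per_axis):
    """Number of distinct configurations if every axis varies independently."""
    return prod(variants_per_axis)

# Hypothetical counts: 3 OS releases, 4 software versions, 3 hardware types.
axes = [3, 4, 3]
print(config_combinations(axes))        # 36 distinct configurations
# Adding one more axis with 2 variants (say, two site layouts) doubles it.
print(config_combinations(axes + [2]))  # 72
```

Adding servers of an existing type adds work roughly in proportion;
adding a new kind of variation multiplies the cases to be tested and
supported.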
There are a number of tools which can help with this:
* distribution of software using (say) Debian packages, so that
software dependency problems are greatly reduced
* the use of mass-scripting tools for software configuration and rollout
* the use of, or development of, generic tools for detection,
deployment, and load-balancing of resources (and shutting them down if
they're broken, up to and including STONITH).
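
As a rough illustration of the third point, the detection and shutdown
logic could look something like the Python sketch below. BackendPool
and its methods are hypothetical names made up for this example; this
is not how perlbal or any existing tool works, and a real version
would also have to power off or otherwise fence the node rather than
just stop routing to it:

```python
from typing import Dict, List, Set

class BackendPool:
    """Tracks which backends may receive traffic (hypothetical helper)."""

    def __init__(self, backends: List[str], max_failures: int = 3) -> None:
        self.failures: Dict[str, int] = {b: 0 for b in backends}
        self.fenced: Set[str] = set()
        self.max_failures = max_failures

    def record_check(self, backend: str, healthy: bool) -> None:
        """Feed in the result of one health check for one backend."""
        if backend in self.fenced:
            return  # fenced nodes stay out until an admin re-adds them
        if healthy:
            self.failures[backend] = 0
            return
        self.failures[backend] += 1
        if self.failures[backend] >= self.max_failures:
            # Assumption: this is where a real system would shut the node down.
            self.fenced.add(backend)

    def active(self) -> List[str]:
        """Backends that are still allowed to receive requests."""
        return [b for b in self.failures if b not in self.fenced]
```

The point of keeping fenced nodes out permanently is that a
half-broken server which passes an occasional check should not drift
back into the pool unnoticed.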
Although this sounds very fancy, it need not be. A number of good free
software packages are already available for these jobs:
For example, the Debian packaging tools will do very nicely for the
first case; tools like cfengine are designed to do the second (there
are lots of other alternatives), and Wikipedia is already using
Perlbal, which is a simple version of the sort of software needed to
do the third part. (Note that something like cfengine can be used to
do the actual grunt work for the auto-deployment of resources on
nodes, or the shutdown of failed nodes.)
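
The general idea behind this kind of tool is to describe the state a
node should be in and let the tool work out what to change. A minimal
Python sketch of that idea follows; it is not cfengine syntax,
plan_changes is a hypothetical function, and the package names and
versions are placeholders:

```python
from typing import Dict, List, Tuple

def plan_changes(desired: Dict[str, str],
                 actual: Dict[str, str]) -> List[Tuple[str, str, str]]:
    """Return the actions needed to bring a node from actual to desired."""
    actions = []
    for name in sorted(desired):
        want = desired[name]
        have = actual.get(name)
        if have is None:
            actions.append(("install", name, want))
        elif have != want:
            actions.append(("change", name, want))
    # Anything present on the node but absent from the description goes.
    for name in sorted(set(actual) - set(desired)):
        actions.append(("remove", name, actual[name]))
    return actions

# Placeholder state for one node.
desired = {"webserver": "2.0", "cache": "1.1"}
actual = {"cache": "1.0", "oldtool": "0.9"}
for action in plan_changes(desired, actual):
    print(action)
```

Running the same plan against a node that already matches yields no
actions, which is what makes it safe to apply repeatedly across many
machines.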
Considering this sort of auto-administration now, whilst Wikipedia is
still relatively monolithic, would be a good idea; the alternative is
to be driven to do it at a later date in an unplanned way.
Regards,
Neil