On 28/12/2009 08:13 PM, Mike.lifeguard wrote:
Which admins? I know River is very experienced with Solaris - but are
the other sysadmins more experienced with Solaris than linux as well?
We had a long discussion about this in the admins IRC channel... we all
agree that having two separate OSs in the platform is bad, and that
either everything should run Linux, or everything should run Solaris.
So the only decision is which OS we should use.
(A third option would be to ditch both Linux and Solaris and move to
another platform entirely... while I'm always looking for alternative
solutions to consider, this is very unlikely to happen any time soon, so
it's irrelevant to this discussion.)
From my point of view, there are three distinct admin tasks which are
regularly needed for the TS. The first is general housekeeping:
monitoring servers, killing errant processes, and so on. There is
little difference between Linux and Solaris here, so we can ignore that.
The second is software installation. Debian has a large pre-built
repository of software, while for Solaris we maintain our own packages,
which means new software has to be built and packaged for installation.
(Which is usually fairly simple, but takes a bit longer than running
"apt-get".)
I don't see this as a particularly critical issue. Even if I end up
having to build all the software myself, it won't add a significant
amount of work to my existing workload.
The third issue is more intricate maintenance and debugging work needed
to keep the system running. This ranges from network installation
profiles that allow reproducible server configurations, through
performance tuning and OS patching, to debugging: MySQL, for
example, is a complicated piece of software with many subtle
interactions with the OS, and maintaining it requires good knowledge of
both the OS and MySQL itself.
At the moment, the vast majority of this sort of work is done by me,
which works because I have a good understanding of, and real-world
experience with, the operating system we use (Solaris), and also
experience using Solaris with the other software we use. On the other
hand, I rarely use Linux, and have no experience of using Linux in
complex systems like the Toolserver, or with MySQL.
Obviously I can install a Linux system and run MySQL on it, but the
moment we run into some subtle operating system interaction or
performance issue with MySQL, it's going to take me a lot longer to fix
it than it would if we ran Solaris. It therefore seems to me that if we
moved to Linux, the overall reliability of the Toolserver platform would
be degraded.
The obvious counter-point here is that if all systems ran Linux, even if
I wasn't able to fix a problem, we would have a larger pool of admins with
Linux experience who could. However, most of our admins have little
time to spend on the Toolserver, and indeed some admins we've added in
the past have effectively disappeared due to other commitments. Of the
three currently active administrators other than myself, one is
self-admittedly not an expert on either platform, and one has little
time to spend on the Toolserver other than the work he already does, so
that leaves one person with Linux experience. Moving most of the
workload from one person to another person doesn't seem like any
advantage at all.
One or two other people have offered their time to the Toolserver in the
past, and have some amount of Linux experience, but given our experience
with adding additional admins so far, I'm reluctant to keep adding
people who disappear or only have time to perform housekeeping tasks;
I'd rather wait for someone who can clearly commit a significant amount
of time to the project.
Then again, what happens if a bus comes along and we then have admins
who know linux best?
This is a valid concern, but if we used Linux, and our single active
Linux admin was hit by a bus, we'd have the same problem. I hope there
will be some changes over the next year or two which will make this
issue moot... until then, I will try to avoid busses.
- river.