Daniel Mayer wrote:
Not sure if this has been discussed yet, but we have US$95,000 budgeted for hardware this quarter ($20,000 of which is for "Extra hardware/dev projects", meaning hardware and/or paying for critical software development). This does not include whatever has been bought already (Jamesday: how much have we spent since the January order? I don't have the bank records for that yet).
http://wikimediafoundation.org/wiki/Budget/2005
I'd really like to get a general idea of what we want to buy soon. If it can be bought before the end of the quarter, two weeks from now, that would make accounting a bit easier for me :).
Note: About $20,000 of the money generated is in the Wikimedia Deutschland bank account. The easiest way I can see to use that money is to buy equipment that would be used in a German-based datacenter. I’ve been told that there are many different offers for free/subsidized hosting in Germany. http://wikimediafoundation.org/wiki/Fund_drives/2005/Q1
So, what do we want to buy?
Daniel Mayer (aka mav)
You asked for some opinions: well, here are my ramblings on the matter. I hope they are reasonably on-topic.
Don't buy cheap anything: at this scale of operations, it costs more than it saves. I wouldn't buy any more boxes from your previous suppliers: the failure rates (for whatever reasons) have been disastrous. Even buying Dell rack-mount boxes would be better: they're (fairly) reliable, (fairly) well-built, and you can buy them anywhere in the world, which is good for standardization. If not Dell, consider some other global supplier. Accept that stuff will break, and break all the time; a good supplier will come to your colo, take the old server away, and fit a new one in your rack. You'll never see the old one again: no waiting for a return to the factory, or the vendor wanting you to test it yourself. _This_ sort of service is what you pay the big suppliers for, and it's worth it when you are running a serious 24/7 system. Also, if you are considering placing a big order with a single supplier, they might be willing to give you a large discount in return for being put on the Wikimedia Foundation's list of sponsors.
Regardless of supplier, I'd try not to buy lots of custom-tailored boxes for squids or apaches: just spec one generic box that can handle either role, as well as generic glue computing (routing, load balancing, etc.). Put as much memory in them as you can afford, and big fat NICs. The money you appear to waste now you'll save later, because having as big a pool of similar machines as possible reduces inventory and spares problems. This way you can also build standard install images, or netinstall images, and deploy them in a "cookie cutter" way when you need to clone more squids or apaches. If you configure the install images/packages right, you should also be able to run them as virtual machines during testing.
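To make the "cookie cutter" idea concrete, here is a minimal sketch (my illustration only, not anything the cluster actually runs) of turning one generic box spec into per-role install inventory; the role names, package lists and addresses are all placeholders:

    from string import Template

    # Hypothetical role definitions: one generic hardware spec, the role is
    # chosen purely by which packages and config get installed on the box.
    ROLE_PACKAGES = {
        "squid":  ["squid"],
        "apache": ["apache2", "memcached"],
    }

    HOST_TEMPLATE = Template(
        "host $hostname\n"
        "  role      $role\n"
        "  packages  $packages\n"
        "  address   $address\n"
    )

    def render_host(hostname, role, address):
        return HOST_TEMPLATE.substitute(
            hostname=hostname,
            role=role,
            packages=" ".join(ROLE_PACKAGES[role]),
            address=address,
        )

    # Cloning three more apaches is just three more inventory entries:
    for i, addr in enumerate(["10.0.0.21", "10.0.0.22", "10.0.0.23"], start=1):
        print(render_host("apache%02d" % i, "apache", addr))

The point is that the hardware stays interchangeable; only the data above differs between a squid and an apache.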
Order 64-bit-capable machines, of course, for their security advantages and for the speed gains when running 64-bit software. Move to 64-bit software as soon as reasonably possible.
Consider having two of everything: two NFS servers, two mail servers, etc. (I know you are already part way towards this.) If necessary, consolidate numerous low-throughput services onto a single server, then have a second server as a complete backup for the first. N services on a single server with a backup is much more reliable than N services on N servers. If you need two servers to handle the throughput, buy three.
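A quick back-of-the-envelope calculation makes the point; assume (purely for illustration) that each box is independently up with probability p:

    # n services spread over n single boxes: everything is up only if every box is up.
    def all_up_spread(p, n):
        return p ** n

    # n services consolidated on one box with a full standby: everything is up
    # unless both boxes are down at the same time.
    def all_up_consolidated(p):
        return 1 - (1 - p) ** 2

    p, n = 0.99, 5
    print("spread over %d boxes:  %.4f" % (n, all_up_spread(p, n)))   # ~0.9510
    print("one box plus standby: %.4f" % all_up_consolidated(p))      # 0.9999

With five services and 99%-available boxes, the spread layout has everything up about 95% of the time; the consolidated pair manages 99.99%, and it uses two boxes instead of five.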
Remember that your tightest constraint is admin skills and time; having a highly standardized system for deployment will save you _vast_ amounts of scarce time, and any deployment automation you can use will make for vastly more reliable systems deployment. Consider Debian as both an operating system and a deployment system: at the moment, different machines run different operating system flavours, making sysadminning harder.
Note that the software packaging approach for deployment also makes it easier to deploy in remote sites: even on hardware other than your own.
Databases are a different issue: you can't apply the same commodity thinking. However, try to order DB machines in identical multiples too, for the same reasons. Clearly the DB machines will need to be hand-crafted. I don't know much about databases at this sort of scale... but I imagine the Wikipedia developers do.
Consider investing in some simple environmental monitoring: power drain, temperature, and humidity, in various places. You can get rack-mount units for this. Overheating, or excess dryness or wetness, is seriously bad for your servers, and hence for your data and your sysadmins' sanity. An ounce of prevention is again worth a pound of cure.
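As a sketch of how little glue this needs once the sensors are in place (the sensor path, threshold and mail addresses below are made-up placeholders; a real rack-mount unit would more likely be polled over SNMP or a serial line), something like this run from cron would do:

    import smtplib
    from email.message import EmailMessage

    TEMP_LIMIT_C = 30.0                                   # illustrative threshold
    SENSOR_PATH = "/var/run/envmon/rack1_temperature"     # placeholder path

    def read_temperature(path=SENSOR_PATH):
        # Assumes the monitoring unit exposes its reading as plain text.
        with open(path) as f:
            return float(f.read().strip())

    def alert(subject, body, to="root@localhost"):
        msg = EmailMessage()
        msg["Subject"] = subject
        msg["From"] = "envmon@localhost"
        msg["To"] = to
        msg.set_content(body)
        with smtplib.SMTP("localhost") as smtp:
            smtp.send_message(msg)

    if __name__ == "__main__":
        temp = read_temperature()
        if temp > TEMP_LIMIT_C:
            alert("Rack over-temperature",
                  "rack1 is at %.1f C (limit %.1f C)" % (temp, TEMP_LIMIT_C))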
Finally, you might consider buying a cheap radio clock on a per-site basis, if your colo does not already provide a local stratum-1 NTP feed. Keeping your distributed clusters around the world in time sync, even when connectivity is lost, is a good thing when you have a widely distributed system that wants to keep global consistency.
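Checking how far a box has drifted from the reference clock takes nothing beyond a minimal SNTP query (RFC 4330) using the standard library; the server name below is just a stand-in for whatever stratum-1 source each site ends up with:

    import socket
    import struct
    import time

    NTP_EPOCH_OFFSET = 2208988800  # seconds between 1900-01-01 and 1970-01-01

    def sntp_time(server="ntp.example.org", port=123, timeout=5.0):
        # Build a 48-byte SNTP request: LI=0, version 3, mode 3 (client).
        request = b"\x1b" + 47 * b"\0"
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(request, (server, port))
            reply, _ = sock.recvfrom(48)
        # Transmit timestamp (seconds since 1900) sits at bytes 40-43 of the reply.
        seconds = struct.unpack("!I", reply[40:44])[0]
        return seconds - NTP_EPOCH_OFFSET

    if __name__ == "__main__":
        drift = time.time() - sntp_time()
        print("local clock is %.3f s ahead of the NTP server" % drift)

Anything more serious than a one-off check should, of course, go through ntpd itself.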
-- Neil