Alex J. Avriette wrote:
Order 64-bit capable machines, of course, for their security advantages, and speed advantages when running 64-bit software. Move to 64-bit software as soon as reasonably possible.
Want to voice 100% approval of this. Nobody wants to hear that Postgres compiles natively in 64-bit and makes use of it (I measured a 40% increase in row-insert performance on UltraSPARC, among other things), but it bears mentioning. I don't know whether MySQL can be built 64-bit (n64). I can help build a 32/64-bit or pure 64-bit gcc if anyone would like. Opteron, Opteron, Opteron. I don't think we need 64-bit Xeons or Itanium/Itanium2s.
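As an aside, a quick sanity check that a build actually came out 64-bit is to look at its pointer width. A minimal sketch in Python (the approach and names are mine, not anything from this thread):

    import struct
    import sys

    # A 64-bit build has 8-byte C pointers; a 32-bit build has 4-byte ones.
    # struct.calcsize("P") is the pointer size of the running build, in bytes.
    bits = struct.calcsize("P") * 8
    print(f"Python on {sys.platform} is a {bits}-bit build")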
However, consider the global supplier situation for Opterons. Alas, Dell doesn't sell them. Who would be the right supplier for this?
Ordering identical machines in multiples gives vastly more reliable systems deployment. Consider Debian as both an operating system and a deployment system: at the moment, different machines run different operating system flavours, making sysadminning harder.
Red Hat, of course, has its own products for these purposes, but I'm pretty OS-agnostic.
Databases are a different issue: you can't apply the same commodity thinking. However, try to order DB machines in identical multiples, too, for the same reasons.
Anyone know what an 8-way 848 Opteron costs these days?
Clearly the DB machines will need to be hand-crafted. I don't know much about databases at this sort of scale... but I imagine the Wikipedia developers do.
First, multiplicity. Second, fast disk: hardware RAID controllers, and tablespaces that let you put your indices on the really fast disk and your "big data" on the slower, cheaper, bigger disk (a sketch of the latter below). Is SAN attachment an option, or are we sticking with NFS? SAN over 1 Gbit Ethernet (which is where we're at) is not so bad. I mean, it could be worse.
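For the index placement point above, a minimal sketch of what that looks like on a PostgreSQL new enough to support tablespaces, driven from Python via psycopg2. The tablespace name, mount path, and table/column names are all illustrative, not Wikipedia's actual schema:

    import psycopg2

    # Illustrative only: put an index on a fast-disk tablespace while the
    # table's data stays on the default (bigger, slower) volume.
    conn = psycopg2.connect("dbname=wikidb")
    conn.autocommit = True  # CREATE TABLESPACE refuses to run in a transaction
    cur = conn.cursor()
    cur.execute("CREATE TABLESPACE fast_ix LOCATION '/mnt/fastdisk/pg_ix'")
    cur.execute("CREATE INDEX cur_title_ix ON cur (cur_title) TABLESPACE fast_ix")
    cur.close()
    conn.close()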
Oh, and don't forget the Gbytes and Gbytes of RAM!
for your servers, and hence your data and sysadmin sanity. An ounce of prevention is again worth a pound of cure.
Amen.
Finally, you might consider buying a cheap radio clock on a per-site basis, if your colo does not already provide you with a local stratum-1.
Am I missing something? Can we not just sync with {tick,tock}.usno.navy.mil? If usno.navy.mil goes down, we have much bigger problems than a toasted Wikipedia.
Consider connectivity going down at a remote site (it takes just one backhoe at the spot a quarter-mile down the road where all of your supposedly "diverse" fibers join the same duct, or simply a router going mad, or someone hitting the Big Red Power-Off Button at ********* ***** [substitute your national Achilles-heel IXP]), rather than Global Thermonuclear War. Trust me, a local clock is a good thing; that's why it's such a useful service to have onsite.
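For reference, the remote sync being discussed is a single UDP round trip, which is exactly why it feels sufficient right up until the link dies. A minimal SNTP client sketch in Python (server choice and names are mine):

    import socket
    import struct
    import time

    NTP_TO_UNIX = 2208988800  # seconds between the 1900 NTP epoch and the Unix epoch

    def sntp_time(server="tick.usno.navy.mil", timeout=5.0):
        # Minimal SNTP request: LI=0, version=3, mode=3 (client), rest zeroed.
        request = b"\x1b" + 47 * b"\x00"
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(request, (server, 123))
            reply, _ = sock.recvfrom(48)
        # Transmit timestamp: 32-bit seconds + 32-bit fraction, bytes 40-47.
        seconds, fraction = struct.unpack("!II", reply[40:48])
        return seconds - NTP_TO_UNIX + fraction / 2 ** 32

    if __name__ == "__main__":
        print(time.ctime(sntp_time()))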
Nobody's mentioned backups. Automated backups are hard to do, and manual backups require somebody to go there and swap tapes. Is a tape changer in the cards for us? We'd need to go with LTO or something... and that's pretty ugly, money-wise. Anyone?
aa
Now, that's a very good point. However, it might be better to do what Linus does and "let millions of people mirror it everywhere". This might be something to ask organizations like the UK Mirror Service, the Internet Archive, and Google to do on a formal basis. That way, the backups are off-site too. The current archive is 50 GB. If we take a week to back it up, that's a data rate of 50e9*8/(86400*7) ≈ 661 kbps < 1 Mbps. So Wikipedia could perform complete backups weekly to three different sites at a cost of under 3 Mbps sustained. That's cheaper than the cost of the tape media.
-- Neil
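For anyone who wants to rerun Neil's back-of-envelope, a quick Python check (figures taken straight from his post):

    # Back-of-envelope: weekly full backup of a ~50 GB archive.
    archive_bytes = 50e9               # current archive size
    window = 86400 * 7                 # one week, in seconds
    rate = archive_bytes * 8 / window  # bits per second
    print(f"per-mirror rate: {rate / 1e3:.0f} kbps")      # -> 661 kbps
    print(f"three mirrors:   {3 * rate / 1e6:.1f} Mbps")  # -> 2.0 Mbps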