David Gerard wrote:
On 04/03/07, Brion Vibber <brion@pobox.com> wrote:
howard chen wrote:
So what kind of hardware/software is being used by Wikipedia to host images or other resources?
NFS -> a box with a lot of disks -> lighttpd -> squid
Terrifying, isn't it? :D
I got asked by a journalist to describe our technical setup the other day (after I'd pointed out we're the only top-20 website hosted by a nonprofit, which therefore by definition has no resources).
I said "In Tampa there's lots of big disks, these go to three big database servers - one for English Wikipedia, two for the other 200+ projects - and these go to hundreds of webservers and caches and proxies in Tampa, Paris, Amsterdam and Seoul."
Is that about right as a one-sentence description of our setup?
It's close. There are three master database servers, but they're not any bigger than the 12 or so slave database servers, so it might be more accurate to just say that we have 15 database servers. Caches aren't on dedicated servers, so I would say "webservers and caching proxies" rather than "webservers and caches and proxies", leaving the role of the various backend caches unsaid. We no longer have any active servers in Paris.
A tier you didn't mention, which the journalist may or may not find interesting, is that our load balancing frontend is LVS-DR on commodity servers, with geographic DNS to balance between clusters.
-- Tim Starling
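
As a rough illustration of the load-balancing tier described above (geographic DNS choosing a cluster, then a load balancer choosing a server inside it), here is a minimal Python sketch. The cluster names come from the thread; the region-to-cluster map, the server names, the fallback cluster and the round-robin scheduling are all assumptions made for illustration, not the actual configuration. It only models server selection and does not attempt the packet-level direct routing that LVS-DR performs.

```python
from itertools import cycle

# Hypothetical region-to-cluster map; the real geographic DNS data
# is not given in this thread.
REGION_TO_CLUSTER = {
    "north-america": "tampa",
    "europe": "amsterdam",
    "asia": "seoul",
}
DEFAULT_CLUSTER = "tampa"  # assumed fallback for unknown regions

# Hypothetical server names behind each cluster's load balancer.
CLUSTER_SERVERS = {
    "tampa": ["tampa-1", "tampa-2", "tampa-3"],
    "amsterdam": ["ams-1", "ams-2"],
    "seoul": ["seoul-1", "seoul-2"],
}


class Director:
    """Toy stand-in for a load balancer using round-robin scheduling."""

    def __init__(self, servers):
        self._next = cycle(servers)

    def pick(self):
        # Return the next server in rotation.
        return next(self._next)


DIRECTORS = {name: Director(servers) for name, servers in CLUSTER_SERVERS.items()}


def resolve_cluster(region):
    """Geographic DNS step: map a client region to a cluster."""
    return REGION_TO_CLUSTER.get(region, DEFAULT_CLUSTER)


def route(region):
    """Return (cluster, server) chosen for a client in the given region."""
    cluster = resolve_cluster(region)
    return cluster, DIRECTORS[cluster].pick()
```

Calling route("europe") repeatedly would alternate between the two assumed Amsterdam servers, while an unmapped region falls back to the assumed default cluster.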