On Wed, Jul 25, 2007 at 11:53:36AM -0400, Travis Derouin wrote:
Hi,
It's been some time since we've expanded our server configuration at
wikiHow. We currently have 5 servers (1 Squid, 3 Apache and 1 DB) and
I have some questions about planning for future growth. We do make use
of memcached and eaccelerator, Squid occasionally reaches 300 requests
per second, and we get about 5.5 million unique visitors a month
growing at about 10% per month. All servers are dual Opteron 64bit,
4GB of RAM, with 3x73GB SCSI drives.
Does anyone know how it's possible to estimate how much available
cushion there is for a given Apache server?
Look at its CPU usage. If it's near 100% during peak time, it will not
take much additional load. If it's lower, you can extrapolate how
many more requests the server might be able to take.
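
The extrapolation above can be sketched in a few lines of Python. This is
only a rough model: it assumes CPU usage grows linearly with request rate
and that CPU is the bottleneck, which may not hold for your setup. The 80%
target utilization and the example numbers are assumptions for
illustration, not measurements from any real server.

```python
import math


def estimate_headroom(current_rps, cpu_utilization, target_utilization=0.8):
    """Extra requests/second a server might absorb before reaching the target.

    Assumes CPU usage scales linearly with request rate (a simplification).
    cpu_utilization and target_utilization are fractions between 0 and 1.
    """
    if not 0 < cpu_utilization <= 1:
        raise ValueError("cpu_utilization must be in (0, 1]")
    max_rps = current_rps * target_utilization / cpu_utilization
    return max(0.0, max_rps - current_rps)


def months_until_full(cpu_utilization, target_utilization=0.8, monthly_growth=0.10):
    """Months of compound traffic growth until the target utilization is hit."""
    if cpu_utilization >= target_utilization:
        return 0.0
    return math.log(target_utilization / cpu_utilization) / math.log(1 + monthly_growth)


if __name__ == "__main__":
    # Hypothetical peak-time numbers: 100 req/s at 40% CPU on one Apache.
    print(estimate_headroom(100, 0.40))
    print(months_until_full(0.40))
```

With the 10% monthly growth mentioned in the question, a server at 40% CPU
at peak would reach 80% in a bit over seven months under these assumptions.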
Is there an upper limit on the number of requests per
second or server load?
That depends heavily on your wiki - the read vs. edit ratio and the kinds
of extensions and templates you're using.
Basically, at what
point can you be sure that you need to add more Apache servers?
Sure? Well, when users are complaining about performance, I'd say.
The ratio of Squid to Apache server should be about 1:10, correct?
Once you add a 2nd Squid, what's the best way to load balance between
the 2? DNS round robin or use a hardware load balancer?
We used round robin for quite a while - and it was a mess. We had to
switch IPs when a server failed, and a lot of manual intervention was
needed (we had a not so stable squid version with a not so stable kernel
on not so stable hardware, so we practiced this quite often). These
days, we use an old 3GHz Pentium 4 server with LVS as a software load
balancer. It's currently handling about 6,000 req/s with a CPU usage of 20%.
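
To illustrate why a balancer in front of the Squids removes the manual IP
switching, here is a minimal Python sketch of round-robin selection that
skips backends marked as down. It is not LVS and not how LVS is
configured. The class name and the backend names are made up for the
example, and in a real setup something has to run the health checks that
call mark_down() and mark_up().

```python
class RoundRobinBalancer:
    """Toy round-robin balancer that skips backends marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the backend to try next

    def mark_down(self, backend):
        # Called by a (hypothetical) health checker when a backend fails.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a known backend passes its health check again.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once, starting where we left off.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next]
            self._next = (self._next + 1) % len(self.backends)
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends")


if __name__ == "__main__":
    lb = RoundRobinBalancer(["squid1", "squid2"])  # hypothetical names
    print([lb.pick() for _ in range(4)])
    lb.mark_down("squid1")
    print([lb.pick() for _ in range(3)])
```

With plain DNS round robin, clients keep resolving to the failed IP until
someone changes the records or moves the address by hand. Here the failed
backend simply stops being picked.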
When should a 2nd database server be added? Are there any other
optimizations we could be benefiting from?
Hard to say. Look at the cache hit ratio. How much time are the Apaches
waiting for the DBs? How much memory and how many disks does your DB
server have?
We increased the DB cache hit ratio by moving the text of the revisions
to MySQL instances running on the Apache servers ("external storage").
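
One way to put a number on the cache hit ratio, assuming the tables are
InnoDB, is to compare two counters from SHOW GLOBAL STATUS:
Innodb_buffer_pool_read_requests (logical reads) and
Innodb_buffer_pool_reads (reads that had to go to disk). The Python sketch
below computes the ratio over an interval from two samples of those
counters. How you collect the samples is up to you, and the numbers in the
example are invented.

```python
def buffer_pool_hit_ratio(before, after):
    """InnoDB buffer pool hit ratio between two SHOW GLOBAL STATUS samples.

    before/after are dicts holding the integer counters
    Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads.
    """
    requests = (after["Innodb_buffer_pool_read_requests"]
                - before["Innodb_buffer_pool_read_requests"])
    disk_reads = (after["Innodb_buffer_pool_reads"]
                  - before["Innodb_buffer_pool_reads"])
    if requests <= 0:
        raise ValueError("no read requests in the sampled interval")
    return 1.0 - disk_reads / requests


if __name__ == "__main__":
    # Invented sample values, e.g. taken a few minutes apart at peak time.
    before = {"Innodb_buffer_pool_read_requests": 1000,
              "Innodb_buffer_pool_reads": 100}
    after = {"Innodb_buffer_pool_read_requests": 11000,
             "Innodb_buffer_pool_reads": 200}
    print(buffer_pool_hit_ratio(before, after))
```

Sampling over an interval at peak time is more telling than the totals
since server start, which average in the quiet hours.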
If anyone has any input, it would be much appreciated.
Also look at Mark's presentation at
http://www.nedworks.org/~mark/presentations/san/Wikimedia%20architecture.pdf
Or Domas' presentation at
http://dammit.lt/uc/workbook2007.pdf
Or my presentation (German only) at
http://jeluf.mormo.org/PDF/WikimediaTech.pdf
I also found this book quite interesting. They faced the same problems
as we did - and came up with very similar solutions:
http://www.oreilly.com/catalog/web2apps/
Regards,
jens