On Wed, Jul 25, 2007 at 11:53:36AM -0400, Travis Derouin wrote:
Hi,
It's been some time since we've expanded our server configuration at wikiHow. We currently have 5 servers (1 Squid, 3 Apache and 1 DB) and I have some questions about planning for future growth. We do make use of memcached and eaccelerator, Squid occasionally reaches 300 requests per second, and we get about 5.5 million unique visitors a month growing at about 10% per month. All servers are dual Opteron 64bit, 4GB of RAM, with 3x73GB SCSI drives.
Does anyone know how it's possible to estimate how much available cushion there is for a given Apache server?
Look at its CPU usage. If it's near 100% at peak times, it won't be able to take much additional load. If it's lower, you can extrapolate how many more requests the server might be able to handle.
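As a rough illustration of that extrapolation, here is a minimal Python sketch. It assumes CPU usage scales roughly linearly with request rate, which is only an approximation. The function names, the 80% CPU ceiling, and the sample figures (100 req/s at 50% CPU) are made up for the example; only the 10% monthly growth comes from the question above.

```python
import math

def estimate_headroom(peak_rps, peak_cpu, cpu_ceiling=0.8):
    """Linearly extrapolate the request rate at which CPU hits cpu_ceiling.

    peak_rps:    requests/s observed on one Apache at peak (measured by you)
    peak_cpu:    CPU usage at that peak, as a fraction (0.5 == 50%)
    cpu_ceiling: the usage you are willing to run at (assumed value)
    """
    if not 0 < peak_cpu <= 1:
        raise ValueError("peak_cpu must be a fraction in (0, 1]")
    max_rps = peak_rps * cpu_ceiling / peak_cpu
    return max_rps, max_rps - peak_rps

def months_until_saturated(peak_rps, max_rps, monthly_growth=0.10):
    """Months of compound growth until peak_rps reaches max_rps."""
    if max_rps <= peak_rps:
        return 0.0
    return math.log(max_rps / peak_rps) / math.log(1 + monthly_growth)

# Hypothetical measurement: 100 req/s at 50% CPU during peak.
max_rps, spare_rps = estimate_headroom(peak_rps=100, peak_cpu=0.5)
months = months_until_saturated(peak_rps=100, max_rps=max_rps)
print(f"ceiling ~{max_rps:.0f} req/s, spare ~{spare_rps:.0f} req/s, "
      f"~{months:.1f} months at 10% growth")
```

Treat the result as a lower bound on how soon to act: load rarely scales perfectly linearly, and memory or the database may become the bottleneck before the CPU does.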
Is there an upper limit on the number of requests per second or server load?
That depends heavily on your wiki: the read/edit ratio and the kind of extensions and templates you're using.
Basically, at what point can you be sure that you need to add more Apache servers?
Sure? Well, when users are complaining about performance, I'd say.
The ratio of Squid to Apache servers should be about 1:10, correct? Once you add a 2nd Squid, what's the best way to load balance between the 2? DNS round robin or a hardware load balancer?
We used round robin for quite a while, and it was a mess: we had to switch IPs whenever a server failed, and a lot of manual intervention was needed (we had a not-so-stable Squid version with a not-so-stable kernel on not-so-stable hardware, so we practiced this quite often). These days, we use an old 3GHz Pentium 4 server with LVS as a software load balancer. It's currently handling about 6,000 req/s at a CPU usage of 20%.
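To illustrate why a balancer with health checks needs less manual intervention than plain DNS round robin, here is a small Python sketch. It is not how LVS is implemented, just a toy model of the idea: backends are still rotated in order, but one that has been marked down is skipped automatically. The class name and the host names are hypothetical.

```python
from itertools import cycle

class HealthCheckedPool:
    """Round-robin over backends, skipping any that are marked down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._ring = cycle(self.backends)

    def mark_down(self, backend):
        # Called when a health check fails; no DNS or IP change needed.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a health check succeeds again.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call.
        for _ in range(len(self.backends)):
            candidate = next(self._ring)
            if candidate in self.healthy:
                return candidate
        raise RuntimeError("no healthy backends")

pool = HealthCheckedPool(["squid1", "squid2"])
first_two = [pool.pick(), pool.pick()]       # normal rotation
pool.mark_down("squid1")                     # squid1 fails a health check
after_failure = [pool.pick(), pool.pick()]   # traffic goes to squid2 only
print(first_two, after_failure)
```

With plain DNS round robin there is no equivalent of `mark_down`: clients keep resolving to the dead server's IP until someone moves the address or changes the DNS records.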
When should a 2nd database server be added? Are there any other optimizations we could be benefiting from?
Hard to say. Look at the cache hit ratio. How much time do the Apaches spend waiting for the DBs? How much memory and how many disks does your DB server have?
We increased the DB cache hit ratio by moving the text of the revisions to MySQL instances running on the Apache servers ("external storage").
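One way to put a number on the cache hit ratio is sketched below in Python. It assumes the tables are InnoDB and uses the two counters from MySQL's SHOW GLOBAL STATUS output, Innodb_buffer_pool_read_requests and Innodb_buffer_pool_reads. The function name and the sample values are made up, and other storage engines use different counters.

```python
def buffer_pool_hit_ratio(status):
    """Fraction of InnoDB logical reads served from memory.

    status: mapping of SHOW GLOBAL STATUS variable names to values
            (strings or ints), e.g. collected with your MySQL client.
    Returns None if no reads have been counted yet.
    """
    requests = int(status["Innodb_buffer_pool_read_requests"])
    disk_reads = int(status["Innodb_buffer_pool_reads"])
    if requests == 0:
        return None
    return 1.0 - disk_reads / requests

# Hypothetical counter values, not taken from any real server.
sample = {
    "Innodb_buffer_pool_read_requests": "1000000",
    "Innodb_buffer_pool_reads": "50000",
}
ratio = buffer_pool_hit_ratio(sample)
print(f"buffer pool hit ratio: {ratio:.1%}")
```

The counters accumulate since server start, so comparing two samples taken during peak hours gives a more useful figure than a single reading.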
If anyone has any input, it would be much appreciated.
Also look at Mark's presentation at http://www.nedworks.org/~mark/presentations/san/Wikimedia%20architecture.pdf
Or Domas' presentation at http://dammit.lt/uc/workbook2007.pdf
Or my presentation (German only) at http://jeluf.mormo.org/PDF/WikimediaTech.pdf
I also found this book quite interesting: they faced the same problems as we did and came up with very similar solutions. http://www.oreilly.com/catalog/web2apps/
Regards,
jens