Hi,
It's been some time since we've expanded our server configuration at wikiHow. We currently have 5 servers (1 Squid, 3 Apache and 1 DB) and I have some questions about planning for future growth. We use memcached and eAccelerator. Squid occasionally peaks at 300 requests per second, and we get about 5.5 million unique visitors a month, growing at about 10% per month. All servers are dual Opteron 64-bit machines with 4GB of RAM and 3x73GB SCSI drives.
Does anyone know how it's possible to estimate how much available cushion there is for a given Apache server? Is there an upper limit on the number of requests per second or server load? Basically, at what point can you be sure that you need to add more Apache servers?
The ratio of Squid to Apache servers should be about 1:10, correct? Once you add a 2nd Squid, what's the best way to load balance between the two: DNS round robin or a hardware load balancer?
When should a 2nd database server be added? Are there any other optimizations we could be benefiting from?
If anyone has any input, it would be much appreciated.
Travis
On Wed, Jul 25, 2007 at 11:53:36AM -0400, Travis Derouin wrote:
Does anyone know how it's possible to estimate how much available cushion there is for a given Apache server?
Look at its CPU usage. If it's near 100% during peak time, it will not take much additional load. If it's lower, you can extrapolate how many more requests the server might be able to take.
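The extrapolation described above can be sketched in a few lines of Python. This is only a rough illustration: it assumes CPU usage grows linearly with request rate, which real servers only approximate. The function names, the 80% target utilization, and the example figures of 100 req/s at 50% CPU are hypothetical; the 10% monthly growth is the figure from the original question.

```python
import math


def estimate_headroom(current_rps, cpu_utilization, target_utilization=0.8):
    """Extra requests/s a server could absorb before reaching the target CPU level.

    Assumes CPU usage scales linearly with request rate (a simplification).
    Utilizations are fractions between 0 and 1.
    """
    if not 0 < cpu_utilization <= 1:
        raise ValueError("cpu_utilization must be in (0, 1]")
    capacity_rps = current_rps * target_utilization / cpu_utilization
    return max(0.0, capacity_rps - current_rps)


def months_until_full(current_rps, capacity_rps, monthly_growth=0.10):
    """Months of compound traffic growth until current_rps reaches capacity_rps."""
    if current_rps >= capacity_rps:
        return 0.0
    return math.log(capacity_rps / current_rps) / math.log(1 + monthly_growth)


if __name__ == "__main__":
    # Hypothetical peak measurement: 100 req/s at 50% CPU.
    headroom = estimate_headroom(100, 0.5)
    print(f"headroom: {headroom:.0f} req/s")
    print(f"months left: {months_until_full(100, 100 + headroom):.1f}")
```

Measuring at peak time matters here: an average taken over the whole day would overstate the headroom.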
Is there an upper limit on the number of requests per second or server load?
That depends heavily on your wiki: the read vs. edit ratio and the kinds of extensions and templates you're using.
Basically, at what point can you be sure that you need to add more Apache servers?
Sure? Well, when users are complaining about performance, I'd say.
The ratio of Squid to Apache servers should be about 1:10, correct? Once you add a 2nd Squid, what's the best way to load balance between the two: DNS round robin or a hardware load balancer?
We used round robin for quite a while, and it was a mess: whenever a server failed we had to switch IPs, which took a lot of manual intervention. (We had a not-so-stable Squid version with a not-so-stable kernel on not-so-stable hardware, so we practiced this quite often.) These days we use an old 3GHz Pentium 4 server with LVS as a software load balancer. It currently handles about 6,000 req/s at a CPU usage of 20%.
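To illustrate why a load balancer in front of the Squids avoids the manual IP switching described above, here is a minimal Python sketch of round-robin selection that skips backends marked as down. It is not how LVS is implemented; the class name and the hostnames squid1 and squid2 are hypothetical, and a real setup would mark backends up or down from periodic health checks.

```python
class BackendPool:
    """Round-robin over backends, skipping any that are marked as down."""

    def __init__(self, backends):
        self.backends = list(backends)
        self.healthy = set(self.backends)
        self._next = 0  # index of the next backend to try

    def mark_down(self, backend):
        # Called when a health check fails: stop sending traffic there.
        self.healthy.discard(backend)

    def mark_up(self, backend):
        # Called when a health check succeeds again.
        if backend in self.backends:
            self.healthy.add(backend)

    def pick(self):
        # Try each backend at most once per call, in rotation order.
        for _ in range(len(self.backends)):
            backend = self.backends[self._next % len(self.backends)]
            self._next += 1
            if backend in self.healthy:
                return backend
        raise RuntimeError("no healthy backends available")


if __name__ == "__main__":
    pool = BackendPool(["squid1", "squid2"])  # hypothetical hostnames
    print([pool.pick() for _ in range(4)])
    pool.mark_down("squid1")
    print([pool.pick() for _ in range(3)])
```

With plain DNS round robin there is no equivalent of mark_down: clients keep receiving the failed server's address until someone changes the records by hand.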
When should a 2nd database server be added? Are there any other optimizations we could be benefiting from?
Hard to say. Look at the cache hit ratio. How much time do the Apaches spend waiting for the DBs? How much memory and how many disks does your DB server have?
We increased the DB cache hit ratio by moving the text of the revisions to MySQL instances running on the Apache servers ("external storage").
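As a rough illustration of the cache hit ratio check, the Python sketch below computes it from two MySQL status counters. It assumes the tables use InnoDB, which the thread does not state, and reads the counters Innodb_buffer_pool_read_requests (logical reads) and Innodb_buffer_pool_reads (reads that had to go to disk), as reported by SHOW GLOBAL STATUS. The function names and sample numbers are hypothetical.

```python
def buffer_pool_hit_ratio(read_requests, disk_reads):
    """Fraction of logical reads served from the InnoDB buffer pool."""
    if read_requests <= 0:
        raise ValueError("read_requests must be positive")
    return 1.0 - disk_reads / read_requests


def hit_ratio_between(before, after):
    """Hit ratio over an interval, from two snapshots of the status counters.

    Each snapshot is a dict holding the two counter values. Using the
    difference avoids averaging over the whole time since server start.
    """
    requests = (after["Innodb_buffer_pool_read_requests"]
                - before["Innodb_buffer_pool_read_requests"])
    reads = after["Innodb_buffer_pool_reads"] - before["Innodb_buffer_pool_reads"]
    return buffer_pool_hit_ratio(requests, reads)


if __name__ == "__main__":
    # Hypothetical counter values sampled a few minutes apart at peak time.
    before = {"Innodb_buffer_pool_read_requests": 1_000_000,
              "Innodb_buffer_pool_reads": 80_000}
    after = {"Innodb_buffer_pool_read_requests": 3_000_000,
             "Innodb_buffer_pool_reads": 180_000}
    print(f"hit ratio: {hit_ratio_between(before, after):.2%}")
```

A ratio that drops as the data set grows suggests the working set no longer fits in memory, which is the situation that moving revision text to external storage was meant to relieve.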
If anyone has any input, it would be much appreciated.
Also look at Mark's presentation at http://www.nedworks.org/~mark/presentations/san/Wikimedia%20architecture.pdf
Or Domas' presentation at http://dammit.lt/uc/workbook2007.pdf
Or my presentation (German only) at http://jeluf.mormo.org/PDF/WikimediaTech.pdf
I also found this book quite interesting. The authors faced the same problems as we did and came up with very similar solutions. http://www.oreilly.com/catalog/web2apps/
Regards,
jens
That's great, thanks a lot for all the info!
On 7/25/07, Jens Frank jf@mormo.org wrote:
On Wed, Jul 25, 2007 at 11:53:36AM -0400, Travis Derouin wrote:
Hi,
I have another question if anyone is able to answer it:
What are the peak and average requests per second for a Squid and Apache server for Wikipedia?
Thanks, Travis
On 7/27/07, Travis Derouin travis@wikihow.com wrote:
That's great, thanks a lot for all the info!
On Mon, 2007-07-30 at 11:28 -0400, Travis Derouin wrote:
What are the peak and average requests per second for a Squid and Apache server for Wikipedia?
If you're planning on running that Apache server on an SMP machine, look into running something like nginx[1] instead of Squid. I did some fairly heavy load testing, both with Squid on public:80 with Apaches behind it and with Apache in front, and Apache (2.x) won every time by a wide margin.
Squid just can't handle SMP very well, even with multiple squid instances running in front of Apache. When/if they update the code to optimize Squid to use threads, it might be worth revisiting again.
I haven't personally made the switch to nginx, but I've heard nothing but praise for its speed and efficiency.
On Mon, Jul 30, 2007 at 01:41:44PM -0400, David A. Desrosiers wrote:
If you're planning on running that Apache server on an SMP machine, look into running something like nginx[1] instead of Squid. I did some fairly heavy load testing, both with Squid on public:80 with Apaches behind it and with Apache in front, and Apache (2.x) won every time by a wide margin.
Squid just can't handle SMP very well, even with multiple squid instances running in front of Apache. When/if they update the code to optimize Squid to use threads, it might be worth revisiting again.
I haven't personally made the switch to nginx, but I've heard nothing but praise for its speed and efficiency.
As far as I understand the nginx.net list of basic features, nginx does not offer caching. But caching is exactly why we are using Squid.
Regards,
jens
On 7/30/07, Travis Derouin travis@wikihow.com wrote:
I have another question if anyone is able to answer it:
What are the peak and average requests per second for a Squid and Apache server for Wikipedia?
~33K IIRC.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org http://lists.wikimedia.org/mailman/listinfo/wikitech-l
MinuteElectron.
On 7/30/07, Minute Electron minuteelectron@googlemail.com wrote:
~33K IIRC.
That's for the entire farm, not one server.
Travis Derouin wrote:
I have another question if anyone is able to answer it:
What are the peak and average requests per second for a Squid and Apache server for Wikipedia?
Thanks, Travis
See data at http://ganglia.wikimedia.org