Travis Derouin wrote:
I have a Squid question, thought I'd post it here since the set up is related to Wikipedia's, and no one on the Squid mailing list had any ideas...
We're running Squid as a reverse proxy (HTTP accelerator), configured as described on meta.wikimedia.org. When our back-end web server that generates the uncached content has problems (high load, etc.) and becomes unresponsive, so does our front-end Squid server: requests for both cached and uncached pages stall.
Is there a way to configure Squid to ignore this and continue serving cached content to users once a certain timeout has passed when contacting the back-end Apache server? It would help in two ways: visitors accessing cached content wouldn't experience the interruption, and connections to the back-end server wouldn't pile up and add to the problem.
I don't believe we've had this problem. Squid should keep serving cached pages reasonably efficiently until the file descriptor limit is reached or you run out of CPU. You might want to check that there is in fact no communication back to the web server when Squid serves "cached" pages. It might be doing an If-Modified-Since (IMS) refresh, or a full request. If it is, then you need to make sure you're sending the right Expires and Cache-Control headers.
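As a rough way to sanity-check those headers, here is a minimal Python sketch that estimates how long a shared cache such as Squid may serve a response without going back to the origin. It is a simplification of the standard HTTP freshness rules, not Squid's actual logic (it ignores refresh_pattern heuristics, Age, Vary and so on), and the function names are made up for illustration.

```python
from email.utils import parsedate_to_datetime


def parse_cache_control(value):
    """Split a Cache-Control header value into a {directive: argument} dict."""
    directives = {}
    for part in value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, arg = part.partition("=")
        directives[name.strip().lower()] = arg.strip().strip('"')
    return directives


def shared_freshness_seconds(headers):
    """Estimate the explicit freshness lifetime for a shared cache.

    Returns a number of seconds, 0 if the response should not be served
    from a shared cache without contacting the origin, or None if the
    headers carry no explicit freshness information (the cache would then
    fall back on its own heuristics). Hypothetical helper, simplified rules.
    """
    h = {k.lower(): v for k, v in headers.items()}
    cc = parse_cache_control(h.get("cache-control", ""))

    # These directives force revalidation or forbid shared caching.
    if "no-store" in cc or "no-cache" in cc or "private" in cc:
        return 0

    # s-maxage takes precedence over max-age for shared caches.
    for name in ("s-maxage", "max-age"):
        if name in cc:
            try:
                return max(0, int(cc[name]))
            except ValueError:
                return 0

    # Fall back to Expires relative to Date.
    if "expires" in h and "date" in h:
        try:
            expires = parsedate_to_datetime(h["expires"])
            date = parsedate_to_datetime(h["date"])
            return max(0, int((expires - date).total_seconds()))
        except (TypeError, ValueError):
            # An unparseable Expires (e.g. "0") means "already expired".
            return 0

    return None
```

Feed it the response headers your back end sends for a page that should be cacheable. A result of 0 or None suggests Squid has no explicit permission to serve that page from cache without contacting the back end.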
Make sure you're using epoll, not poll. It's possible that your select loop is becoming too slow as the connection count increases. If so, you'll see high system CPU and a high select loop period in cachemgr.cgi.
Although it's probably irrelevant in this case, you should also check that your Squid server isn't being loaded beyond its capacity in other ways, namely disk speed, RAM and network. We've often observed cache hits becoming slower than cache misses when Squid is under capacity.
But above all, don't forget to fix your back-end performance problem. It's not much of a wiki if all it can serve is cache hits. There's some advice on improving performance at
http://meta.wikimedia.org/w/index.php?title=MediaWiki_FAQ&oldid=333976#M...
It seems to have been deleted from the current version of the FAQ.
-- Tim Starling