On 11-03-24 07:43 PM, Tim Starling wrote:
Our parser cache hit ratio is very low, around 30%.
http://tstarling.com/stuff/hit-rate-2011-03-25.png
This seems to be mostly due to insufficient parser cache size. My theory is that if we increased the parser cache size by a factor of 10 to 100, then most of the yellow area on that graph should go away. This would reduce our Apache CPU usage substantially.
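The size-versus-hit-ratio reasoning above can be illustrated with a small simulation. This is only a sketch under stated assumptions: it assumes an LRU eviction policy and a synthetic, Zipf-like page popularity distribution, neither of which is taken from the real parser cache or its traffic. The names `lru_hit_ratio` and `synthetic_trace` are hypothetical helpers made up for this example.

```python
import random
from collections import OrderedDict


def lru_hit_ratio(trace, capacity):
    """Replay a request trace against an LRU cache and return the hit ratio."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for key in trace:
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / len(trace)


def synthetic_trace(pages=10000, requests=50000, seed=0):
    """Build a made-up request trace where page popularity falls off as 1/rank.

    ASSUMPTION: real wiki traffic is not necessarily distributed like this.
    """
    rng = random.Random(seed)
    weights = [1.0 / rank for rank in range(1, pages + 1)]
    return rng.choices(range(pages), weights=weights, k=requests)


if __name__ == "__main__":
    trace = synthetic_trace()
    # Grow the cache by a factor of 10 each step and compare hit ratios.
    for capacity in (100, 1000, 10000):
        print(capacity, round(lru_hit_ratio(trace, capacity), 3))
```

For a fixed trace, a larger LRU cache never has a lower hit ratio than a smaller one, so the printed ratios can only stay flat or rise as the capacity grows. How much they rise depends entirely on the workload, which is why a test deployment is the real measurement.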
The parser cache does not have particularly stringent latency requirements, since most requests only do a single parser cache fetch.
So I researched the available options for disk-backed object caches. Ehcache stood out, since it has a suitable feature set out of the box and was easy to use from PHP. I whipped up a MediaWiki client for it and committed it in r83208.
My plan is to do a test deployment of it, starting on Monday my time (i.e. Sunday night US time), and continuing until the cache fills up somewhat, say 2 weeks. This deployment should have no user-visible consequences, except perhaps for an improvement in speed.
-- Tim Starling
Interesting. I've been debating memory vs. disk caches myself for a while. I work with cloud servers a lot, and while I may one day get something to the point where scaling out caches and whatnot becomes important, I probably still won't be at a 'colocate the servers' scale by then. So I've been thinking about this within the limitations of the cloud. In the cloud, RAM is relatively expensive, there's a limit to the server size you can get, and high RAM usually means really expensive cloud machines that border on "Hey, this is insane, I might as well go dedicated", but disk is readily available. And while low latency is nice, I don't believe it's what we're aiming for when we're caching. Most of the stuff we cache in MW is not cached because we need high-volume, low-latency access to it, but because the MySQL queries that build it and things like parsing are so slow and expensive that we want to cache the results temporarily. In that situation it doesn't really matter whether the cache is on disk or in memory, and larger caches can be useful.
For a while I was thinking, 'What if I give memcached a machine of its own with a really large cache size and let it swap?' But if we're looking at support for disk caches, beautiful. Especially if they have hybrid models where they keep the highly accessed parts of the cache in memory and expand onto disk.
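To make the hybrid idea concrete, here is a rough sketch of a two-tier cache: a small in-memory LRU tier in front of a larger disk tier. It is not how Ehcache or any other product implements this; the class name `HybridCache`, the write-through policy, and the one-pickle-file-per-key layout are all assumptions chosen to keep the example short.

```python
import hashlib
import os
import pickle
import tempfile
from collections import OrderedDict


class HybridCache:
    """Hypothetical two-tier cache: hot entries in memory, everything on disk."""

    def __init__(self, directory, memory_items=2):
        self.directory = directory        # where the disk tier stores its files
        self.memory_items = memory_items  # maximum entries kept in memory
        self.memory = OrderedDict()       # least recently used entry first

    def _path(self, key):
        # Hash the key so that any string maps to a safe file name.
        digest = hashlib.sha1(key.encode("utf-8")).hexdigest()
        return os.path.join(self.directory, digest + ".pkl")

    def _evict(self):
        # Drop the least recently used entries from memory only;
        # their copies on disk are left in place.
        while len(self.memory) > self.memory_items:
            self.memory.popitem(last=False)

    def set(self, key, value):
        self.memory[key] = value
        self.memory.move_to_end(key)
        # Write through to disk so memory eviction never loses data.
        with open(self._path(key), "wb") as f:
            pickle.dump(value, f)
        self._evict()

    def get(self, key):
        if key in self.memory:
            self.memory.move_to_end(key)  # memory hit: refresh recency
            return self.memory[key]
        try:
            with open(self._path(key), "rb") as f:
                value = pickle.load(f)    # disk hit: slower, but still cached
        except FileNotFoundError:
            return None                   # miss in both tiers
        self.memory[key] = value          # promote the entry back into memory
        self._evict()
        return value


if __name__ == "__main__":
    cache = HybridCache(tempfile.mkdtemp(), memory_items=2)
    for title in ("Main_Page", "Foo", "Bar"):
        cache.set(title, "<html>" + title + "</html>")
    # "Main_Page" was evicted from memory, but is still served from disk.
    print(cache.get("Main_Page"))
```

A real disk tier would also need expiry and a size limit on disk, which this sketch leaves out.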
What others did you look at? From a quick look I see Redis, Ehcache, JCS, and OSCache.
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]