On Mon, Nov 28, 2011 at 12:06 PM, Roan Kattouw roan.kattouw@gmail.com wrote:
> On Mon, Nov 28, 2011 at 8:59 PM, Neil Harris neil@tonal.clara.co.uk wrote:
>> I hadn't thought properly about cache stampedes: since the parser cache is
>> only part of page rendering, this might also explain some of the other
>> occasional slowdowns I've seen on Wikipedia.
>> It would be really cool if there could be some sort of general mechanism
>> to prevent this for all page URLs protected by memcaching, throughout the
>> system.
> I'm not very familiar with PoolCounter, but I suspect it's a fairly generic
> system for handling this sort of thing. However, stampedes have never been a
> practical problem for anything except massive traffic combined with slow
> recaching, and that's a fairly rare case. So I don't think we want to add
> that sort of concurrency protection everywhere.
For memcache objects that can be grouped together into an "ok to use if a bit stale" bucket (such as all kinds of stats), there is also the possibility of lazy async regeneration.
Data is stored in memcache with a fuzzy expire time, i.e. { data: foo, stale: $now + 15min }, and a cache TTL of forever. When getting the key, if the timestamp inside marks the data as stale, attempt to obtain an exclusive (acq4me) lock from PoolCounter. If that immediately succeeds, launch an async job to regenerate the cache (while holding the lock) but continue the request with the stale data. In all other cases, just use the stale data. This is mainly useful when the regeneration work is hideously expensive, such that you wouldn't want clients blocking on even a single cache regen (which is the behavior of PoolCounter as deployed for the parser cache).
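
To make the control flow concrete, here is a minimal sketch in Python of that fuzzy-expiry / lazy-regeneration pattern. This is not MediaWiki code: the in-memory Cache class stands in for memcached, the add()-based try_lock() stands in for a non-blocking PoolCounter acq4me acquisition, the background thread stands in for an async job, and all the names (get_with_lazy_regen, STALE_AFTER, regenerate) are illustrative.

import threading
import time

STALE_AFTER = 15 * 60  # seconds of freshness before a value counts as stale


class Cache:
    """Dict-backed stand-in for memcached (get/set/add/delete only)."""

    def __init__(self):
        self._data = {}
        self._mutex = threading.Lock()

    def get(self, key):
        with self._mutex:
            return self._data.get(key)

    def set(self, key, value):
        with self._mutex:
            self._data[key] = value

    def add(self, key, value):
        # Like memcached add: succeeds only if the key is absent.
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = True
            return True

    def delete(self, key):
        with self._mutex:
            self._data.pop(key, None)


cache = Cache()


def try_lock(key):
    # Non-blocking exclusive lock; a real deployment would ask PoolCounter.
    return cache.add("lock:" + key, time.time())


def release_lock(key):
    cache.delete("lock:" + key)


def regenerate(key, compute):
    # Runs in the background while the original request returns stale data.
    try:
        cache.set(key, {"data": compute(), "stale": time.time() + STALE_AFTER})
    finally:
        release_lock(key)


def get_with_lazy_regen(key, compute):
    entry = cache.get(key)
    if entry is None:
        # Cold cache: nothing stale to serve, so compute synchronously.
        value = compute()
        cache.set(key, {"data": value, "stale": time.time() + STALE_AFTER})
        return value

    if time.time() >= entry["stale"] and try_lock(key):
        # Stale and we won the lock: regenerate asynchronously,
        # but keep serving the stale value to this request.
        threading.Thread(target=regenerate, args=(key, compute), daemon=True).start()

    # Fresh, or stale but someone else is already regenerating: serve what we have.
    return entry["data"]


if __name__ == "__main__":
    print(get_with_lazy_regen("site-stats", lambda: {"edits": 123}))

In a real deployment the lock-and-dispatch step would go through PoolCounter and the job queue rather than a memcached add() and a local thread, since a central lock service coordinates across all app servers; the sketch only illustrates the read path and the "serve stale, regenerate once" decision.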