On Mon, Nov 28, 2011 at 12:06 PM, Roan Kattouw <roan.kattouw(a)gmail.com> wrote:
On Mon, Nov 28, 2011 at 8:59 PM, Neil Harris <neil(a)tonal.clara.co.uk> wrote:
I hadn't thought properly about cache stampedes: since the parser cache is only part of page rendering, this might also explain some of the other occasional slowdowns I've seen on Wikipedia.

It would be really cool if there were some sort of general mechanism to prevent this for all page URLs protected by memcaching, throughout the system.
I'm not very familiar with PoolCounter but I suspect it's a fairly
generic system for handling this sort of thing. However, stampedes
have never been a practical problem for anything except massive
traffic combined with slow recaching, and that's a fairly rare case.
So I don't think we want to add that sort of concurrency protection
everywhere.
For memcache objects that can be grouped together into an "ok to use if a bit stale" bucket (such as all kinds of stats), there is also the possibility of lazy async regeneration.

Data is stored in memcache with a fuzzy expire time, i.e. { data: foo, stale: $now+15min } and a cache TTL of forever. When getting the key, if the timestamp inside marks the data as stale, you can attempt to obtain an exclusive (acq4me) lock from PoolCounter. If that succeeds immediately, launch an async job to regenerate the cache (while holding the lock) but continue the request with the stale data. In all other cases, just use the stale data. This is mainly useful if the regeneration work is hideously expensive, such that you wouldn't want clients blocking on even a single cache regen (as is the behavior with PoolCounter as deployed for the parser cache).
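To make the flow concrete, here is a minimal in-process sketch of that fuzzy-expiry scheme. The dict-based cache, the single global lock, and the helper names are all stand-ins for illustration: in production the store would be memcached and the exclusive lock would come from PoolCounter (which locks per key, not globally).

```python
import time
import threading

# Stand-in cache: key -> {"data": ..., "stale_at": timestamp}.
CACHE = {}
# Stand-in for a per-key PoolCounter acq4me lock (simplified to one lock).
REGEN_LOCK = threading.Lock()

def cache_set(key, data, fresh_for=15 * 60):
    # Store the fuzzy expiry timestamp *inside* the value; the real cache
    # entry itself would have a TTL of forever.
    CACHE[key] = {"data": data, "stale_at": time.time() + fresh_for}

def cache_get(key, regenerate):
    entry = CACHE.get(key)
    if entry is None:
        # True miss: no stale copy exists, so regenerate synchronously.
        data = regenerate()
        cache_set(key, data)
        return data
    if time.time() >= entry["stale_at"]:
        # Stale: try to take the exclusive lock without blocking.
        if REGEN_LOCK.acquire(blocking=False):
            def worker():
                try:
                    cache_set(key, regenerate())
                finally:
                    REGEN_LOCK.release()
            # Regenerate asynchronously while holding the lock...
            threading.Thread(target=worker).start()
        # ...but serve the stale data now, whether or not we got the lock.
    return entry["data"]
```

Only one request (the lock winner) pays anything at all, and even it only pays the cost of spawning the async job; everyone else, including the winner, is served the stale copy immediately, which is what distinguishes this from the blocking behavior of the parser-cache PoolCounter deployment.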