On Mon, Nov 28, 2011 at 8:28 PM, Neil Harris <neil(a)tonal.clara.co.uk> wrote:
And adding memcached caching with even, say, as little as a 1-minute
cache entry timeout should dilute that reduced load even more, and put
an upper bound on the load generated, just in case it gets
slashdotted/reddited again.
It was already in memcached, cached for 15 minutes. However, if
recaching takes a long time and your page gets a lot of traffic, you
can get a cache stampede (just like when Michael Jackson died): while
the recache is in progress, there are more hits for your page and a
zillion Apache workers all race to rebuild the cache, unaware of each
other. I have no evidence that that's what happened, but that's my
theory. Making the recache faster and/or upping the cache timeout
reduces the size and the frequency, respectively, of the window in
which this can happen.
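
A minimal Python sketch of the stampede described above, under stated
assumptions: a plain dict stands in for memcached, threads stand in for
Apache workers, and every name (simulate_stampede, "fundraiser-stats",
and so on) is hypothetical rather than taken from the real code. It only
counts how many workers end up rebuilding the same entry when they all
miss the cache at about the same time.

```python
import threading
import time


def simulate_stampede(workers=20, rebuild_seconds=0.3):
    """Return how many workers rebuilt the same cache entry."""
    cache = {}        # stand-in for memcached (assumption)
    rebuilds = []     # one item per worker that ran the slow rebuild
    start = threading.Barrier(workers)

    def slow_rebuild():
        rebuilds.append(1)           # list.append is atomic in CPython
        time.sleep(rebuild_seconds)  # pretend the statistics query is slow
        return "stats"

    def handle_request():
        start.wait()  # all requests arrive at about the same moment
        value = cache.get("fundraiser-stats")
        if value is None:
            # Cache miss: this worker cannot see that others are
            # already rebuilding, so it rebuilds as well.
            cache["fundraiser-stats"] = slow_rebuild()

    threads = [threading.Thread(target=handle_request) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(rebuilds)


if __name__ == "__main__":
    print("rebuilds:", simulate_stampede())
```

With a slow rebuild, most or all of the workers typically do the work;
shortening rebuild_seconds narrows the window in which that can happen.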
The cache stampede problem was solved for the particular case of the
parser cache using PoolCounter, but I don't think it's necessary for
other types of caching. Computing fundraiser statistics simply
shouldn't be that slow.
Roan
I hadn't thought properly about cache stampedes: since the parser cache
is only part of page rendering, this might also explain some of the
other occasional slowdowns I've seen on Wikipedia.
It would be really cool if there were some sort of general mechanism
to prevent this for all page URLs protected by memcached, throughout
the system.
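
One way such a general mechanism could look, as a rough Python sketch
only: a wrapper that lets a single caller rebuild a given key while the
others wait and then reuse the result. This is not how PoolCounter is
implemented. The class name StampedeGuard and its methods are
hypothetical, and the locks here are in-process, so a real deployment
across many Apache workers would need a shared lock instead.

```python
import threading
import time


class StampedeGuard:
    """Cache wrapper allowing one rebuild per key at a time (sketch)."""

    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self._values = {}   # key -> (expiry time, value)
        self._locks = {}    # key -> lock held while rebuilding
        self._locks_guard = threading.Lock()

    def _lock_for(self, key):
        with self._locks_guard:
            return self._locks.setdefault(key, threading.Lock())

    def _fresh(self, key):
        entry = self._values.get(key)
        if entry is not None and entry[0] > time.monotonic():
            return entry
        return None

    def get(self, key, rebuild):
        entry = self._fresh(key)
        if entry is not None:
            return entry[1]
        with self._lock_for(key):
            # Re-check: another caller may have rebuilt while we waited.
            entry = self._fresh(key)
            if entry is not None:
                return entry[1]
            value = rebuild()
            self._values[key] = (time.monotonic() + self.ttl, value)
            return value
```

Usage would be something like guard.get("fundraiser-stats", compute_stats),
where compute_stats is whatever slow function produces the value.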
-- N.