> I've often wondered this, so this is a great opportunity to jump in.
> Why not cache prerendered versions of all pages? It would
> seem that the majority of hits are reads. One approach I've
> seen elsewhere is to cache a page the first time it's loaded,
> and then have writes invalidate the cache. (That way you're
> not caching pages nobody looks at.)
We have multiple caches. First of all, all pages are cached
by Squid servers, which achieve >75% hit rates for anonymous users.
Cached objects are invalidated by HTCP CLR messages sent
via multicast to our global Squid deployment.
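For illustration, here's a rough Python sketch (not our actual code) of sending such an HTCP CLR purge over UDP multicast. The packet layout follows RFC 2756; the multicast group and port below are example values only.

```python
import random
import socket
import struct

def htcp_clr_packet(url: str) -> bytes:
    """Build a minimal HTCP CLR datagram (RFC 2756) asking caches to drop one URL."""
    def countstr(s: bytes) -> bytes:
        # HTCP strings are length-prefixed (16-bit big-endian length).
        return struct.pack("!H", len(s)) + s

    specifier = (
        countstr(b"HEAD")          # method
        + countstr(url.encode())   # URI to invalidate
        + countstr(b"HTTP/1.0")    # version
        + countstr(b"")            # request headers (none)
    )
    # DATA section: length, opcode (CLR = 4), response/flags, transaction id,
    # then a 16-bit CLR reason and the specifier above.
    data_len = 8 + 2 + len(specifier)
    data = struct.pack("!HBBI", data_len, 4, 0, random.getrandbits(32))
    data += struct.pack("!H", 0) + specifier
    # Overall header (total length, version 0.0) plus an empty AUTH section.
    total_len = 4 + len(data) + 2
    return struct.pack("!HBB", total_len, 0, 0) + data + struct.pack("!H", 2)

def purge(url: str, group: str = "239.128.0.112", port: int = 4827) -> None:
    """Send the CLR to caches listening on a multicast group (example values)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_MULTICAST_TTL, 8)
    sock.sendto(htcp_clr_packet(url), (group, port))

purge("http://en.wikipedia.org/wiki/Main_Page")
```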
Squid caching also gives us an easy way to
bring lots of content closer to users, reducing page load
times dramatically for anons (and a bit for logged-in users).
If we get the possibility to deploy caches in Australia and China,
that'd be awesome. Right now we're still searching for Chinese and
Australian locations, though (deployments of even 3 or just 1 server
would serve those huge countries :)
We cannot cache rendered pages for logged-in users, as the output can
differ per user, though at some point we might achieve that. For now,
there is also the parser cache, which caches parsed documents for
logged-in users as well; we're trying to increase its efficiency too.
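As a rough sketch of the parser cache idea (illustrative Python, not the MediaWiki implementation): parsed output is keyed by page plus the user options that actually affect rendering, so logged-in users with identical options share an entry, and entries older than the page's last-touched time are treated as stale.

```python
import hashlib
from typing import Optional

class ParserCache:
    """Toy parser cache: HTML keyed by page id + hash of rendering options."""

    def __init__(self):
        self._store = {}  # key -> (page_touched, html)

    def _key(self, page_id: int, options: dict) -> str:
        # Only options that change the rendered output (interface language,
        # stub link threshold, ...) should feed the hash; names are illustrative.
        opt_hash = hashlib.md5(repr(sorted(options.items())).encode()).hexdigest()
        return f"parsercache:{page_id}:{opt_hash}"

    def get(self, page_id: int, page_touched: str, options: dict) -> Optional[str]:
        entry = self._store.get(self._key(page_id, options))
        if entry is None:
            return None
        cached_touched, html = entry
        # Anything rendered before the page was last touched (edited, or
        # invalidated via a linked page) is stale.
        return html if cached_touched >= page_touched else None

    def set(self, page_id: int, page_touched: str, options: dict, html: str) -> None:
        self._store[self._key(page_id, options)] = (page_touched, html)
```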
> One tricky part is that writes to page B can affect page A
> if page A has a link to B. A reverse index of links would
> solve this, though I don't know how big it'd be.
We have a reverse index of links, and we use it to invalidate
both parser cache and Squid cache objects.
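A hedged sketch of how such a reverse index drives invalidation (illustrative Python; `parser_cache` and `purge` stand in for the parser cache and the HTCP CLR purge above):

```python
from collections import defaultdict

# backlinks[target] -> titles whose rendered HTML contains a link to target
# (the "reverse index of links").
backlinks = defaultdict(set)

def record_links(source: str, targets: set) -> None:
    """Update the reverse index whenever a page is (re)parsed."""
    for target in targets:
        backlinks[target].add(source)

def invalidate_on_edit(edited_title: str, parser_cache, purge) -> None:
    """On edit (or creation/deletion), drop caches for the page itself and for
    every page linking to it, since their rendered links may have changed."""
    for title in {edited_title} | backlinks[edited_title]:
        parser_cache.invalidate(title)                    # drop parsed HTML
        purge("http://en.wikipedia.org/wiki/" + title)    # HTCP CLR to the Squids
```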
Domas