On Jan 3, 2004, at 13:46, Gabriel Wicke wrote:
What kind of caching is done at the moment? And what are the current
timeouts?
Every page view comes in through wiki.phtml as the entry point. This
runs some setup code, defines functions/classes etc, connects to the
database, normalizes the page name that's been given, and checks if a
login session is active, loading user data if so.
Then the database is queried to see if the page exists and whether it's
a redirect, and to get the last-touched timestamp.
If the client sent an If-Modified-Since header, we compare the given
time against the last-touched timestamp (which is updated on direct
edits and whenever link rendering would change). If it
hasn't changed, we return a '304 Not Modified' code. This covers about
10% of page views.
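
Roughly, the If-Modified-Since check can be sketched as below. This is
Python for illustration only, not the actual code in wiki.phtml; the
function name and arguments are made up for the example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_not_modified(if_modified_since, last_touched):
    """Return True when a '304 Not Modified' response is appropriate.

    if_modified_since: raw header value from the client, or None.
    last_touched: timezone-aware datetime of the page's last-touched
                  timestamp (hypothetical representation).
    """
    if not if_modified_since:
        return False
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable header: fall through to a full response.
        return False
    if client_time.tzinfo is None:
        # Treat a date without a usable zone as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop sub-second precision.
    return last_touched.replace(microsecond=0) <= client_time
```

If this returns True, the script can send the 304 and exit before any
rendering happens.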
If it's not a redirect, we're not looking at an old revision, diff, or
"printable view", and we're not logged in, the file cache kicks in.
This covers some 60% of page views. If saved HTML output is found for
this page, its date is checked. If it's still valid, the file is
dumped out and the script exits. The cache file is a complete gzipped
HTML page; if the browser doesn't advertise understanding gzip, we
decompress it on the fly. (Note that this may skew benchmarks relative
to the browsers actually in use; I don't know.)
If the cached page doesn't exist or is out of date, page rendering
continues as normal, and the output is compressed and saved at the
end. About 2% of page views involve saving a new cached page.
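
The file cache path described above could look something like this
sketch. Again this is illustrative Python with invented names; the
Accept-Encoding handling is deliberately simplified and is an
assumption, not a description of what the real code does.

```python
import gzip


def can_use_file_cache(is_redirect, is_old_revision, is_diff,
                       is_printable, logged_in):
    """The file cache applies only to plain current-page views by
    anonymous users."""
    return not (is_redirect or is_old_revision or is_diff
                or is_printable or logged_in)


def compress_for_cache(html):
    """Turn a fully rendered HTML page into the bytes saved on disk."""
    return gzip.compress(html.encode("utf-8"))


def serve_cached(cached_gzip_bytes, accept_encoding):
    """Return (headers, body) for a cache hit.

    The stored file is a complete gzipped HTML page. It is sent as-is
    when the client advertises gzip, and decompressed on the fly
    otherwise.
    """
    # Simplified check: this ignores q-values such as "gzip;q=0".
    tokens = [t.split(";")[0].strip().lower()
              for t in (accept_encoding or "").split(",")]
    if "gzip" in tokens:
        return {"Content-Encoding": "gzip"}, cached_gzip_bytes
    return {}, gzip.decompress(cached_gzip_bytes)
```

A benchmark tool that sends no Accept-Encoding header would take the
second branch of serve_cached and pay for the decompression on every
hit.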
There's no timeout; pages are invalidated immediately by updating their
last-touched timestamps. A global cache epoch can be set on the server
to invalidate all old cached pages (server- or client-side), and
individual user accounts also have a cache epoch which is reset on
login, when user options are changed, and when talk page notification
comes on/off.
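
One way to picture the invalidation rule is as a comparison against
the newest of the relevant timestamps. The sketch below assumes the
timestamps are plain comparable values and that the epochs are
combined by taking the maximum; both are my assumptions for the
example, and the names are hypothetical.

```python
def effective_last_modified(page_touched, global_epoch, user_epoch=None):
    """Newest of the timestamps that can invalidate a cached copy.

    user_epoch only applies to logged-in users, so it is optional.
    """
    stamps = [page_touched, global_epoch]
    if user_epoch is not None:
        stamps.append(user_epoch)
    return max(stamps)


def cache_is_valid(saved_at, page_touched, global_epoch):
    """A saved page is usable only if it was written at or after the
    last invalidation (page touch or global epoch bump)."""
    return saved_at >= effective_last_modified(page_touched, global_epoch)
```

Bumping global_epoch past every saved_at value makes cache_is_valid
return False everywhere at once, which is the "invalidate all old
cached pages" case.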
If this is a redirect, old page view, diff, or printable view, or if
the user is logged in, then we don't do any server-side caching (yet)
and parse/render the whole page. Some speedups have been accomplished
by precaching link lookup info in easily-loadable chunks. E23's been
working on storage of the HTML-rendered wiki pages to be inserted into
the overall layout, but this needs some more finalization (various user
options may affect the rendering of the page).
Ideally we'd be putting cached data into memcached, which can run
in-memory on the web server (or as a distributed cache over a web
server cluster) without grinding down the disks. So far we use
memcached just for some common data (localized messages, utf8
translation tables, interwiki prefix lookup) and login sessions.
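
The usage pattern for that kind of shared data is a simple
look-aside cache, sketched below. DictCache is a made-up stand-in for
a memcached client so the example runs without a server; it has no
expiry, eviction, or networking, and the key name is invented.

```python
class DictCache:
    """Stand-in for a memcached client: get/set only, no expiry."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        # Returns None on a miss.
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def get_or_load(cache, key, loader):
    """Return the cached value, calling loader() only on a miss."""
    value = cache.get(key)
    if value is None:
        value = loader()
        cache.set(key, value)
    return value


# Example: the expensive load (e.g. building a message table) runs once.
cache = DictCache()
calls = []

def load_messages():
    calls.append(1)
    return {"edit": "Edit this page"}

first = get_or_load(cache, "messages:en", load_messages)
second = get_or_load(cache, "messages:en", load_messages)
```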
-- brion vibber (brion @ pobox.com)