Re: [Wikitech-l] Re: squid idea summary

3 Jan 2004


On Jan 3, 2004, at 13:46, Gabriel Wicke wrote:
...
What kind of caching is done at the moment? And what are the current
timeouts?
Every page view comes in through wiki.phtml as the entry point. This 
runs some setup code, defines functions/classes etc, connects to the 
database, normalizes the page name that's been given, and checks if a 
login session is active, loading user data if so.
Then the database is queried to see if the page exists and whether it's 
a redirect, and to get the last-touched timestamp.
If the client sent an If-Modified-Since header, we compare the given 
time against the last-touched timestamp (which is updated for cases 
where link rendering would change as well as direct edits). If it 
hasn't changed, we return a '304 Not Modified' code. This covers about 
10% of page views.
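
For illustration, the comparison above could be sketched roughly like this in Python. This is not the actual wiki.phtml code; the function name and parameters are made up for the example, and the last-touched timestamp is assumed to be a timezone-aware datetime.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def is_not_modified(if_modified_since, last_touched):
    """Return True if a '304 Not Modified' response is appropriate.

    if_modified_since: raw header value sent by the client, or None.
    last_touched: timezone-aware datetime of the page's last-touched stamp
                  (hypothetical representation, chosen for this sketch).
    """
    if not if_modified_since:
        return False
    try:
        client_time = parsedate_to_datetime(if_modified_since)
    except (TypeError, ValueError):
        # Unparseable header: fall through to a normal response.
        return False
    if client_time.tzinfo is None:
        # Treat dates without a usable zone as UTC.
        client_time = client_time.replace(tzinfo=timezone.utc)
    # HTTP dates have one-second resolution, so drop microseconds.
    return last_touched.replace(microsecond=0) <= client_time
```

A caller would send the 304 and stop when this returns True, and carry on with the rest of the request otherwise.
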
If it's not a redirect, we're not looking at an old revision, diff, or 
"printable view", and we're not logged in, the file cache kicks in. 
This covers some 60% of page views. If saved HTML output is found for 
this page, its date is checked. If it's still valid, the file is 
dumped out and the script exits. The cache file is a complete gzipped 
HTML page; if the browser doesn't advertise understanding gzip, we 
decompress it on the fly. (Note that this may skew benchmarks relative 
to actual browsers in use; I don't know.)
If the cached page doesn't exist or is out of date, page rendering 
continues as normal, and the output is compressed and saved at the 
end. About 2% of page views involve saving a new cached page.
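
The file cache path described above might look something like the following Python sketch. It is only an approximation under stated assumptions, not the real code: the function names are invented, the cache file's mtime stands in for the saved page's date, and the Accept-Encoding check is deliberately naive.

```python
import gzip
import os


def cacheable(is_redirect, is_old_revision, is_diff, is_printable, logged_in):
    """The file cache only applies to plain current-page views by anonymous users."""
    return not (is_redirect or is_old_revision or is_diff
                or is_printable or logged_in)


def serve_from_cache(cache_path, last_touched_ts, accept_encoding):
    """Return (body, is_gzipped), or None if there is no valid cache file.

    last_touched_ts is a Unix timestamp (an assumption of this sketch).
    """
    try:
        if os.path.getmtime(cache_path) < last_touched_ts:
            return None  # cached copy is out of date
        with open(cache_path, "rb") as f:
            data = f.read()
    except OSError:
        return None  # no cache file for this page
    # Naive check: ignores q-values such as "gzip;q=0".
    if "gzip" in accept_encoding.lower():
        return data, True
    # Client doesn't advertise gzip: decompress on the fly.
    return gzip.decompress(data), False


def save_to_cache(cache_path, html):
    """Compress the finished page and save it, replacing any old copy."""
    tmp = cache_path + ".tmp"
    with open(tmp, "wb") as f:
        f.write(gzip.compress(html.encode("utf-8")))
    os.replace(tmp, cache_path)  # swap in the new file in one step
```

When serve_from_cache returns None, rendering would continue as usual and the result would be handed to save_to_cache at the end.
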
There's no timeout; pages are invalidated immediately by updating their 
last-touched timestamps. A global cache epoch can be set on the server 
to invalidate all old cached pages (server- or client-side), and 
individual user accounts also have a cache epoch which is reset on 
login, when user options are changed, and when talk page notification 
turns on or off.
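
Put another way, a cached copy is only good if it is at least as new as every stamp that applies to it. A minimal Python sketch of that rule, with a hypothetical function name and assuming all timestamps are mutually comparable values (integers here):

```python
def cache_is_valid(cached_at, last_touched, global_epoch, user_epoch=None):
    """Return True if a cached copy is still usable.

    cached_at:     when the cached copy was produced
    last_touched:  the page's last-touched timestamp
    global_epoch:  server-wide cache epoch
    user_epoch:    per-account cache epoch, or None for anonymous views
    """
    stamps = [last_touched, global_epoch]
    if user_epoch is not None:
        stamps.append(user_epoch)
    # Any stamp newer than the cached copy invalidates it.
    return cached_at >= max(stamps)
```

So bumping the global epoch past every cached_at value invalidates everything at once, with no per-page bookkeeping.
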
If this is a redirect, old page view, diff, or printable view, or if 
the user is logged in, then we don't do any server-side caching (yet) 
and parse/render the whole page. Some speedups have been accomplished 
by precaching link lookup info in easily-loadable chunks. E23's been 
working on storage of the HTML-rendered wiki pages to be inserted into 
the overall layout, but this needs some more finalization (various user 
options may affect the rendering of the page).
Ideally we'd be putting cached data into memcached, which can run 
in-memory on the web server (or as a distributed cache over a web 
server cluster) without grinding down the disks. So far we use 
memcached just for some common data (localized messages, utf8 
translation tables, interwiki prefix lookup) and login sessions.
-- brion vibber (brion @ pobox.com)
