I've implemented a 404 handler caching system for MediaWiki. 404 handler caching works by keeping copies of articles in a directory on the server in HTML format; if a page is missing from the directory, an error document (AKA 404 handler) is called by the server. If the 404 handler is MediaWiki, it can render the page to the client and cache that version to the directory, so that next time the Apache server will just serve the file directly. There's more on the technique here:
http://meta.wikimedia.org/wiki/404_handler_caching
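To make the flow concrete, here is a minimal sketch of the miss path, in Python for illustration rather than the PHP of the actual extension. The names cache_path, handle_404 and the render callback are hypothetical, and the file-naming scheme is an assumption, not what Cache404.php does.

```python
import os
import tempfile
from urllib.parse import quote


def cache_path(cache_dir, title):
    """Map a page title to a file name inside the cache directory."""
    # quote(..., safe="") also escapes "/", so a title cannot point outside cache_dir.
    return os.path.join(cache_dir, quote(title, safe="") + ".html")


def handle_404(cache_dir, title, render):
    """Render a page that was missing from the cache, store it, and return the HTML.

    `render` stands in for the wiki's page renderer (hypothetical callback).
    """
    html = render(title)
    target = cache_path(cache_dir, title)
    # Write to a temporary file and rename it into place, so the web server
    # never serves a half-written file.
    fd, tmp = tempfile.mkstemp(dir=cache_dir)
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(html)
    os.replace(tmp, target)
    return html
```

On the next request the file exists, so the web server can serve it without calling the handler at all.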
Anyways, I did an implementation as a MediaWiki extension; the code is at http://wikitravel.org/~evan/Cache404.php.txt . There's no documentation to speak of; this is just an alpha version.
I've got a test wiki working with it on Wikitravel: http://wikitravel.org/test/ .
If you're at all interested, well, now you know.
~ESP
On Nov 16, 2004, at 4:25 PM, Evan Prodromou wrote:
I've implemented a system for MediaWiki. 404 handler caching works by keeping copies of articles in a directory on the server in HTML format; if a page is missing from the directory, an error document (AKA 404 handler) is called by the server. If the 404 handler is MediaWiki, it can render the page to the client and cache that version to the directory, so that next time the Apache server will just serve the file directly.
Spiffy. There are some obvious problems of course, namely the loss of all the user options and message notifications... You could probably use mod_rewrite to shunt users with login/session cookies over to a non-cached directory, which might still be a win over the old file cache (which currently requires loading up all the PHP scripts and touching the database before determining whether it should pass through the saved file).
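The routing decision itself is small. As a rough sketch in Python (the real thing would be a mod_rewrite condition on the Cookie header), assuming hypothetical cookie-name suffixes that a given wiki would replace with its own:

```python
from http.cookies import SimpleCookie

# Hypothetical suffixes for login/session cookie names; an assumption for
# illustration, not the names any particular MediaWiki install uses.
SESSION_COOKIE_SUFFIXES = ("_session", "UserID")


def should_bypass_cache(cookie_header):
    """Return True when the request carries a login/session cookie."""
    cookies = SimpleCookie()
    cookies.load(cookie_header or "")
    return any(name.endswith(SESSION_COOKIE_SUFFIXES) for name in cookies)
```

Anonymous requests would go to the static directory; anything that returns True here would be sent to the non-cached path.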
Alternatively, loading the HTML from the _start_ of the PHP script (before loading up the whole MediaWiki class structure) should still be reasonably fast and would be easier to set up on sites that don't have much control over the web server setup. This would require appropriate purging on edit, just as the existing Squid cache and your experimental code do. You might be able to merge the code to support both this mode and a 404-handler mode.
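A sketch of that early-exit mode, again in Python for illustration only; serve_or_none and purge are hypothetical names, and the cache file path is assumed to be computed before any of the heavy application code is loaded:

```python
import os


def serve_or_none(cache_file):
    """Return the cached HTML if it exists, else None so the caller loads the full app."""
    try:
        with open(cache_file, encoding="utf-8") as f:
            return f.read()
    except FileNotFoundError:
        return None


def purge(cache_file):
    """Remove a cached copy after an edit; a file that is already gone is not an error."""
    try:
        os.remove(cache_file)
    except FileNotFoundError:
        pass
```

The same purge step would serve a 404-handler mode too, since deleting the file is what makes the server fall through to the handler again.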
Your client-side cache problem on edit should be solved by sending appropriate cache-control headers to force the client to re-check on every hit. I'm not sure offhand how to get Apache to do that, but I suspect it's doable.
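For reference, the header logic would look roughly like this. It is a Python sketch of the HTTP behaviour, not Apache configuration, and revalidation_response is a hypothetical name:

```python
from datetime import timezone
from email.utils import formatdate, parsedate_to_datetime


def revalidation_response(last_modified_ts, if_modified_since=None):
    """Return (status, headers) for a page last changed at last_modified_ts (epoch seconds)."""
    headers = {
        # The client may keep a copy but has to re-check it with the server on every use.
        "Cache-Control": "max-age=0, must-revalidate",
        "Last-Modified": formatdate(last_modified_ts, usegmt=True),
    }
    if if_modified_since:
        try:
            dt = parsedate_to_datetime(if_modified_since)
            if dt.tzinfo is None:
                dt = dt.replace(tzinfo=timezone.utc)  # treat zone-less dates as UTC
            since = dt.timestamp()
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        if since is not None and int(last_modified_ts) <= since:
            return 304, headers  # client copy is still current
    return 200, headers
```

After an edit the page's modification time moves forward, so the next re-check gets a full 200 response instead of a 304.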
The template invalidation list query is really ugly; we really should remember to stuff things in a template link table for 1.4; it shouldn't be that hard. Also, think about merging off the Squid code which is currently special-cased in a bunch of places. There shouldn't be a need to have two separate 'give me a list of pages to invalidate' functions.
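Something along these lines is what I mean by a single function. It's a Python sketch under assumptions: template_links stands in for the proposed template link table as a mapping from a title to the pages that include it, and the purgers are hypothetical callbacks for the Squid and file caches.

```python
def pages_to_invalidate(title, template_links):
    """Return the edited page plus every page that includes it, directly or indirectly."""
    seen = {title}
    queue = [title]
    while queue:
        current = queue.pop()
        for user in template_links.get(current, ()):
            if user not in seen:
                seen.add(user)
                queue.append(user)
    return seen


def invalidate(title, template_links, purgers):
    """Hand the same list of pages to every registered cache purger."""
    pages = sorted(pages_to_invalidate(title, template_links))
    for purge in purgers:
        purge(pages)
    return pages
```

Each cache backend then only needs to know how to purge a list of titles, not how to compute it.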
-- brion vibber (brion @ pobox.com)
On Tue, 2004-16-11 at 16:56 -0800, Brion Vibber wrote:
Spiffy. There are some obvious problems of course, namely the loss of all the user options and message notifications...
Well, so, here's my theory: if we get an email-notification (or even IM/Jabber notification... mmmm!) built-in, it will make new-message notifications much less crucial.
Also, I think that it's possible to do some CSS, cookies and Javascript-based client-side skinning that would obviate the need for most of the user preferences we have now. I'm thinking something along these lines...
http://www.alistapart.com/articles/alternate/
...but I think that's where we're going to have to go with Wikitravel. We just can't afford the cycles to figure out who likes "?" and who likes red paint for broken links.
Your client-side cache problem on edit should be solved by sending appropriate cache-control headers to force the client to re-check on every hit. I'm not sure offhand how to get Apache to do that, but I suspect it's doable.
I'm thinking about trying to tag a random URL parameter at the end, like:
http://example.com/wiki/Some_title_here?random=04567AC1
...but there might be an easier way to do that.
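A quick sketch of the tagging step, in Python for illustration; add_cache_buster is a hypothetical helper and the parameter name "random" is just the one from the example URL above:

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_cache_buster(url, token=None):
    """Return url with a fresh `random` query parameter, replacing any existing one."""
    parts = urlsplit(url)
    query = [
        (k, v)
        for k, v in parse_qsl(parts.query, keep_blank_values=True)
        if k != "random"
    ]
    # 4 random bytes give an 8-character hex token like the one in the example.
    query.append(("random", token or secrets.token_hex(4).upper()))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

The redirect after a save would go to the tagged URL, so the browser treats it as a page it hasn't cached yet.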
The problem is hard to reproduce, though.
The template invalidation list query is really ugly; we really should remember to stuff things in a template link table for 1.4; it shouldn't be that hard.
Yeah, it's a bummer. A mitigating factor for Wikitravel is that we don't use a lot of templates yet.
Another thing I'd really love to concentrate on for 1.4 is building extension modules rather than a monolithic application. I'd really like to enable some kind of hooks-processing to make building behaviour extensions easier.
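By hooks-processing I mean something roughly like the following. This is a Python sketch of the general idea only; register_hook, run_hooks and the event names are hypothetical, not an existing MediaWiki interface.

```python
from collections import defaultdict

# Event name -> list of handlers, in registration order.
_hooks = defaultdict(list)


def register_hook(event, handler):
    """Attach a handler to a named event."""
    _hooks[event].append(handler)


def run_hooks(event, *args):
    """Call handlers in registration order; a handler returning False stops the chain."""
    for handler in _hooks[event]:
        if handler(*args) is False:
            return False
    return True
```

An extension like the cache would register a handler for a hypothetical "ArticleSaved" event and purge its file there, without the core code knowing about it.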
~ESP
wikitech-l@lists.wikimedia.org