While the recent changes to our Squid caching behavior mean more good stuff stays in cache, improving performance, they've also made it harder to clear bad stuff out of the cache.
One particularly irksome issue is "blank pages" -- when PHP dies due to a bug or hitting a memory or time limit, you just get blank output. When this comes before we've set our cache-control headers, the current Squid settings mean that gets cached and shown to everybody else who comes to that page.
This is rather troublesome for non-default actions and Special: pages, since an ?action=purge won't help you -- you have to ask a developer to log in and clear the URL manually. Ouch!
To combat this, I've added a default 'Cache-control: no-cache' header into our local CommonSettings.php file, which comes early in code execution.
When everything goes ok, proper caching headers will override it... but if the script dies partway through, that no-cache header gets sent out along with the blank output, and the squids won't keep it.
The next refresh or the next visitor will be a new request, and a transitory bug won't be stuck in cache.
-- brion vibber (brion @ wikimedia.org)
> One particularly irksome issue is "blank pages" -- when PHP dies due to a bug or hitting a memory or time limit, you just get blank output. When this comes before we've set our cache-control headers, the current Squid settings mean that gets cached and shown to everybody else who comes to that page.
You can't modify the Squid code to include "IF size>0 THEN cache() ELSE end"?
Thomas Dalton wrote:
> > One particularly irksome issue is "blank pages" -- when PHP dies due to a bug or hitting a memory or time limit, you just get blank output. When this comes before we've set our cache-control headers, the current Squid settings mean that gets cached and shown to everybody else who comes to that page.
> You can't modify the Squid code to include "IF size>0 THEN cache() ELSE end"?
That would solve some cases, but not others (where output is made but headers are missing). Additionally it would be a lot harder, requiring patching of C code and a rollout of new packages sitewide.
Or I could put one line of code in a PHP file and be done with it.
-- brion
On 1/16/08, Brion Vibber <brion@wikimedia.org> wrote:
> To combat this, I've added a default 'Cache-control: no-cache' header into our local CommonSettings.php file, which comes early in code execution.
Should this be added to the software at some point?
Simetrical wrote:
> On 1/16/08, Brion Vibber <brion@wikimedia.org> wrote:
> > To combat this, I've added a default 'Cache-control: no-cache' header into our local CommonSettings.php file, which comes early in code execution.
> Should this be added to the software at some point?
Well, the problem it solves is a quirk of our customized Squid configuration, but it wouldn't be a bad idea in principle.
Note that the latest versions of PHP will output an HTTP 500 result code ('internal server error') on fatal error... sometimes... but an extra cache-control header doesn't hurt. :)
-- brion
Hi,
I have a problem which I suspect is related to the recent caching changes.
I have a script that screenscrapes the Commons POTD and creates an email to send to daily-image-l@lists.wikimedia.org. But for the last three days it has sent information about the same image, from Jan 15th.
It runs this command: wget -erobots=off -q -O - http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/To...
When I visit http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/Today in my browser I get the right day, today's one.
Any ideas about this?
thanks, Brianna
On 17/01/2008, Brianna Laugher <brianna.laugher@gmail.com> wrote:
> Hi,
> I have a problem which I suspect is related to the recent caching changes.
> I have a script that screenscrapes the Commons POTD and creates an email to send to daily-image-l@lists.wikimedia.org. But for the last three days it has sent information about the same image, from Jan 15th.
> It runs this command: wget -erobots=off -q -O - http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/To...
> When I visit http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/Today in my browser I get the right day, today's one.
> Any ideas about this?
If I'm following this thread correctly, the squids cache by URL, so chances are the copy cached for the URL you're using has gone stale. Can you just add &action=purge to the end?
On 1/17/08, Brianna Laugher <brianna.laugher@gmail.com> wrote:
> It runs this command: wget -erobots=off -q -O - http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/To...
> When I visit http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/Today in my browser I get the right day, today's one.
I can't reproduce, or at least I don't think I can (maybe I'm misinterpreting):
aryeh@aryeh-desktop:~$ wget -qO - 'http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/To...' | grep 2008-01-17 | head -n 1
|width="100%" dir="ltr"|[[Template:Potd/2008-01-17|change image]]
aryeh@aryeh-desktop:~$ man wget
aryeh@aryeh-desktop:~$ wget -erobots=off -qO - 'http://commons.wikimedia.org/w/index.php?title=Commons:Picture_of_the_day/To...' | grep 2008-01-17 | head -n 1
|width="100%" dir="ltr"|[[Template:Potd/2008-01-17|change image]]
If the problem is still occurring, try posting the headers, from wget -S.
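For anyone gathering those headers, here is a minimal sketch of a filter that keeps only the status line and the cache-related headers from a wget -S dump. The function name cache_headers and the list of header names are my own assumptions for illustration, not anything the squids require; extend the list to taste.

```sh
# Keep only the status line and cache-related headers from a header dump
# read on stdin (for example, the stderr output of wget -S).
# The set of header names below is an assumption, not an exhaustive list.
cache_headers() {
    grep -iE '^[[:space:]]*(HTTP/|Cache-Control:|Age:|Expires:|Last-Modified:|X-Cache)'
}

# Usage sketch ($url is a placeholder for the page being fetched;
# wget -S writes the response headers to stderr, hence the 2>&1):
#   wget -S -O /dev/null "$url" 2>&1 | cache_headers
```

A response that the squids should not keep would be expected to show a Cache-Control line containing no-cache in this output.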
On 18/01/2008, Simetrical <Simetrical+wikilist@gmail.com> wrote:
> If the problem is still occurring, try posting the headers, from wget -S.
Hm. I think the problem is actually template recursion... due to Tim's changes described in "Message mode mess". The output is all like
=={{{day}}}==
{|width="100%" border="0" cellspacing="0" cellpadding="0" style="background:transparent"
|-style="vertical-align:top"
|class="description {{{lang}}}" dir="{{#switch:{{{lang}}}|ar|fa|he|ur=rtl|ltr}}" | [[Image:{{Potd/{{{month}}}-{{{day}}}}}|{{{width|300}}}px|thumb|none|{{Potd/{{{month}}}-{{{day}}} ({{{lang}}})}}]]
|width="100%" dir="ltr"|[[Template:Potd/{{{month}}}-{{{day}}}|{{change image|lang={{{lang}}}}}]]
<ul><li class="description af" dir="ltr">[[Template:Potd/{{{month}}}-{{{day}}} (af)|Afrikaans]]: {{Potd/{{{month}}}-{{{day}}} (af)}}</li>
(Simetrical, how come you got values?)
Because my script didn't get the expected input, it (silently) borked and used the last saved value it had, which was apparently the 15th.
There's no way to force nested templates to evaluate?
thanks Brianna
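The silent fallback described above is the kind of thing a small guard in the scraping script can catch. A rough sketch, assuming the fetched page is piped in on stdin: treat empty input, or input that still contains unexpanded template parameters such as {{{day}}}, as a failure instead of reusing an old saved value. The names potd_input_ok, fetch_potd, and send_mail are hypothetical, not taken from the actual script.

```sh
# Succeed only if the text on stdin is non-empty and contains no
# unexpanded template parameters such as {{{day}}}.
potd_input_ok() {
    input=$(cat)
    # Empty output (for example, a blank page) is never acceptable.
    [ -n "$input" ] || return 1
    case $input in
        *'{{{'*) return 1 ;;   # raw template parameters leaked through
    esac
    return 0
}

# Usage sketch (fetch_potd and send_mail are hypothetical helpers):
#   page=$(fetch_potd) || exit 1
#   if printf '%s\n' "$page" | potd_input_ok; then
#       send_mail "$page"
#   else
#       echo "POTD fetch returned unexpected input; not sending" >&2
#       exit 1
#   fi
```

Failing loudly here means a day with bad input produces an error to investigate, not a repeat of the last good image.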
Hi,
> To combat this, I've added a default 'Cache-control: no-cache' header into our local CommonSettings.php file, which comes early in code execution.
Must be good. We may put it in front of webstart too.
wikitech-l@lists.wikimedia.org