The problem is due to recent changes to how mobile caching works. I just
flushed the cache on all of the frontend varnish instances, which does
appear to have cleared up the symptom, but the underlying problem isn't
actually fixed.
Note that the frontend instances have just 1GB of cache, so only very
popular objects (like the enwiki front page) avoid getting LRU'd. The
backend varnish instances use the SSDs and do the heavy caching work.
When I originally built this, I had the frontends force a short (300s) ttl
on all cacheable objects, while the backends honored the times specified by
mediawiki.
I chose to only send purges to the backend instances (via Wikia's old
varnishhtcpd) and let the frontend instances catch up with their short
ttls. My reasoning was:
1) Our multicast purge stream is very busy and isn't split up by cache
type, so it includes lots of purge requests for images on
upload.wikimedia.org. Processing the purges is somewhat cpu intensive, and
I saw doing so once per varnish server as preferable to twice.
2) Purges are for URLs such as "en.wikipedia.org/wiki/Main_Page". The
frontend varnish instance strips the m subdomain before sending the request
onwards, but still caches content based on the request url. Purges are
never sent for "en.m.wikipedia.org/wiki/Main_Page" - every purge would need
to be rewritten to apply to the frontend varnishes. Doing this blindly
would be more expensive than it should be, since a significant percentage
of purge statements aren't applicable.
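I don't think my original approach had any fans. Purges are now sent to
both varnish instances per host, and more recently, the 300s ttl override
was removed from the frontends. But all of the purges sent to the
frontends are no-ops: they name the desktop URL, while the frontends cache
objects under the mobile request URL.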
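To illustrate why the frontend purges miss, here is a toy Python model of a
cache keyed by the full request URL. It is only a sketch: the FrontendCache
class and its methods are made up for illustration and are not how varnish
is actually implemented or configured.

```python
class FrontendCache:
    """Toy stand-in for a frontend cache keyed by request URL (host + path)."""

    def __init__(self):
        self._objects = {}

    def store(self, url, body):
        self._objects[url] = body

    def lookup(self, url):
        return self._objects.get(url)

    def purge(self, url):
        # True only if an object was actually evicted; False means a no-op.
        return self._objects.pop(url, None) is not None


cache = FrontendCache()
# The mobile main page is cached under the mobile request URL.
cache.store("en.m.wikipedia.org/wiki/Main_Page", "stale main page")

# The purge stream only ever names the desktop URL, so nothing is evicted.
desktop_purge = cache.purge("en.wikipedia.org/wiki/Main_Page")
still_cached = cache.lookup("en.m.wikipedia.org/wiki/Main_Page")
```

Here desktop_purge is False and the stale object is still served, which
matches what people were seeing on the mobile main page.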
There are multiple ways to approach making the purges sent to the frontends
actually work such as rewriting the purges in varnish, rewriting them
before they're sent to varnish depending on where they're being sent, or
perhaps changing how cached objects are stored in the frontend. I
personally think it's all an unnecessary waste of resources and prefer my
original approach.
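For reference, the "rewrite before sending" option could look roughly like
the Python sketch below. The function name, the skip list, and the rule
that only three-label hosts get a mobile variant are my assumptions for
illustration, not what varnishhtcpd does today.

```python
# Hypothetical skip list: hosts in the purge stream with no mobile variant.
NON_MOBILE_SUBDOMAINS = {"upload"}


def rewrite_purge_for_frontend(url):
    """Return the mobile-host form of a purge URL, or None if not applicable.

    Assumes scheme-less URLs like "en.wikipedia.org/wiki/Main_Page".
    """
    host, sep, path = url.partition("/")
    labels = host.split(".")
    # Assumption: only hosts shaped like "<lang>.<project>.<tld>" are rewritten.
    if len(labels) != 3:
        return None
    lang, project, tld = labels
    if lang in NON_MOBILE_SUBDOMAINS:
        return None
    # Insert the "m" label that the frontend sees in the request URL.
    return f"{lang}.m.{project}.{tld}{sep}{path}"


mobile = rewrite_purge_for_frontend("en.wikipedia.org/wiki/Main_Page")
skipped = rewrite_purge_for_frontend(
    "upload.wikimedia.org/wikipedia/commons/a/ab/Example.jpg"
)
```

Even in this simple form, every purge has to be parsed and checked just to
find out that many of them (images in particular) don't apply, which is the
cost I was trying to avoid.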
-Asher
On Fri, May 3, 2013 at 2:23 PM, Arthur Richards <arichards(a)wikimedia.org> wrote:
+wikitech-l
I've confirmed the issue on my end; ?action=purge seems to have no effect
and the 'last modified' notification on the mobile main page looks correct
(though the content itself is out of date and not in sync with the 'last
modified' notification). What's doubly weird to me is that the
'Last-Modified' HTTP response header says:
Last-Modified: Tue, 30 Apr 2013 00:17:32 GMT
Which appears to be newer than when the content I'm seeing on the main
page was updated... Anyone from ops have an idea what might be going on?
On Thu, May 2, 2013 at 10:01 PM, Yuvi Panda <yuvipanda(a)gmail.com> wrote:
Encountered
https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Issue_with…
Some people seem to be having problems with the mobile main page being
cached too much. Can someone look into it?
--
Yuvi Panda T
http://yuvi.in/blog
_______________________________________________
Mobile-l mailing list
Mobile-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mobile-l
--
Arthur Richards
Software Engineer, Mobile
[[User:Awjrichards]]
IRC: awjr
+1-415-839-6885 x6687