Re: [Wikitech-l] is this how our thumbnail caching works?

14 May 2014


On Tue, May 13, 2014 at 4:13 PM, Sumana Harihareswara
sumanah@wikimedia.org wrote:
...
> I am trying to figure out how thumbnail retrieval & caching works right
> now - with Swift, and the frontline & secondary ("frontend" and
> "backend") Varnishes. (I am working on the caching-related bit of the
> performance guidelines, and want to understand and help push forward on
> https://www.mediawiki.org/wiki/Requests_for_comment/Simplify_thumbnail_cache
> .) I looked for docs but didn't find anything that had been updated this
> year.

I was supposed to document this stuff when I first started with the
Foundation. Unfortunately I never really got it done. I've got some
notes and, possibly most helpfully, a diagram that I redrew in
OmniGraffle based on a diagram that Faidon drew on the wall at the
office for me one day last fall. I've had this sitting around on my
local hard drive for months without uploading it anywhere, so I just
threw it up on mw.o [0].

The diagram shows the major components that you described in your
summary. Traffic from the internet for
http://upload.wikimedia.org/.../some_thumb_url.png hits a frontend
LVS which routes to a frontend Varnish server. If the URL is not
cached locally by that Varnish instance, it will compute a hash of the
URL to select the backend Varnish instance that may have the content.
If the backend Varnish doesn't have the content it will request the
thumbnail from the Swift cluster. This request passes through an LVS
that selects a frontend Swift server. The frontend Swift server will
handle the request by asking the backend Swift cluster for the desired
image.

If the image isn't found in the backend cluster, the frontend Swift
server will make a request to an image scaler server to have it
created. The image scalers run thumb.php from mediawiki/core.git to
fetch the original image from Swift (which goes back through the same
LVS -> Swift frontend -> Swift backend path that the thumb request
came down). Once the original image is on the image scaler, the scaler
will run it through the scaling software appropriate for its MIME type
to produce a thumbnail image. I don't remember if at this point the
image is stored in Swift by the image scaler via thumb.php's internal
logic or if that is handled by the frontend Swift server when it gets
the response. In either case, the newly created thumbnail ends up
stored in the Swift cluster and is returned as the image scaler's HTTP
response to the frontend Swift server handling the original request.

The frontend Swift server in turn returns the thumbnail image to the
backend Varnish server, which will cache it locally and then return
the image to the frontend Varnish. Finally the frontend Varnish will
cache the image response in local memory and return the image to the
original requestor.

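To make the lookup order above concrete, here is a rough Python sketch
of the tiers. It is a toy model, not how Varnish or Swift are actually
implemented: the class and method names are made up, the backend is
picked with a plain hash-modulo (the real selection scheme may differ),
the thumbnail key is assumed to be the original's key plus a
"/<size>-<name>" suffix, and the sketch arbitrarily has the Swift
frontend store the new thumbnail, since I don't remember which
component really does that.

```python
import hashlib


class ThumbnailStack:
    """Toy model of the request path; every name here is hypothetical."""

    def __init__(self, backend_count, originals):
        self.frontend_cache = {}  # one frontend Varnish (in memory)
        self.backend_caches = [{} for _ in range(backend_count)]
        self.swift = dict(originals)  # stands in for the Swift cluster
        self.scaler_calls = 0

    def pick_backend(self, url):
        # Hash the URL so the same URL always maps to the same backend.
        # Plain modulo is an assumption made for illustration.
        digest = hashlib.sha256(url.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.backend_caches)

    def scale(self, thumb_url):
        # Stand-in for thumb.php: fetch the original from Swift and
        # pretend to scale it. Assumes the original's key is the thumb
        # URL minus its last path segment.
        original_url = thumb_url.rsplit("/", 1)[0]
        self.scaler_calls += 1
        return b"thumb-of:" + self.swift[original_url]

    def swift_get(self, url):
        if url not in self.swift:
            # Assumption: the Swift frontend stores the new thumbnail.
            self.swift[url] = self.scale(url)
        return self.swift[url]

    def get(self, url):
        """Return (body, tier that answered the request)."""
        if url in self.frontend_cache:
            return self.frontend_cache[url], "frontend"
        backend = self.backend_caches[self.pick_backend(url)]
        if url in backend:
            body, tier = backend[url], "backend"
        else:
            calls_before = self.scaler_calls
            body = self.swift_get(url)
            tier = "scaler" if self.scaler_calls > calls_before else "swift"
            backend[url] = body  # backend Varnish caches on the way back
        self.frontend_cache[url] = body  # frontend caches last
        return body, tier
```

Calling get() twice for the same URL should report "scaler" for the
first request and "frontend" for the second; emptying the frontend
cache in between makes the second request fall through to "backend".
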
The next time this exact thumbnail is requested, it may be found in
the frontend Varnish if the LVS routes to the same Varnish and it
hasn't been evicted from the in-memory cache by age or by the need to
store something newer. The image will stay in the backend Varnish
cache until it ages out based on the response headers or it is evicted
to make room for newer content. In the worst case the thumbnail will
be found in the Swift cluster, where 3 copies of the thumbnail file are
stored indefinitely. The only way that the thumbnail will be removed
from Swift is when a new version of the source image is uploaded, or
the source image is deleted, and a purge request is sent out from the
wiki.

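The retention rules above can be sketched the same way. This is only
an illustration under stated assumptions: the class and function names
are invented, a fixed TTL stands in for "ages out based on the
response headers", least-recently-used eviction stands in for "evicted
to make room for newer content", and the purge helper simply drops the
URL from every tier rather than modelling how the wiki's purge request
is really delivered.

```python
from collections import OrderedDict


class VarnishLikeCache:
    """Toy cache: entries expire after ttl seconds or are evicted
    least-recently-used first when the cache is full."""

    def __init__(self, max_entries, ttl):
        self.max_entries = max_entries
        self.ttl = ttl
        self.entries = OrderedDict()  # url -> (stored_at, body)

    def put(self, url, body, now):
        self.entries[url] = (now, body)
        self.entries.move_to_end(url)
        while len(self.entries) > self.max_entries:
            self.entries.popitem(last=False)  # drop least recently used

    def get(self, url, now):
        item = self.entries.get(url)
        if item is None:
            return None
        stored_at, body = item
        if now - stored_at >= self.ttl:
            del self.entries[url]  # aged out
            return None
        self.entries.move_to_end(url)  # mark as recently used
        return body

    def purge(self, url):
        self.entries.pop(url, None)


def purge_everywhere(url, caches, swift_store):
    # Hypothetical purge: remove the thumbnail from each cache tier and
    # from the (dict-based) stand-in for Swift.
    for cache in caches:
        cache.purge(url)
    swift_store.pop(url, None)
```

The clock is passed in as "now" so the ageing behaviour can be tried
out without waiting; a thumbnail that was never purged stays in the
Swift stand-in no matter how often the caches above it turn over.
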
[0]: https://www.mediawiki.org/wiki/File:Thumbnail-stack.svg
[1]: https://wikitech.wikimedia.org/wiki/Swift/Dev_Notes#Removing_NFS_from_the_sc...
Bryan
-- 
Bryan Davis              Wikimedia Foundation    bd808@wikimedia.org
[[m:User:BDavis_(WMF)]]  Sr Software Engineer            Boise, ID USA
irc: bd808                                        v:415.839.6885 x6855
