I am trying to figure out how thumbnail retrieval & caching works right now - with Swift, and the frontline & secondary ("frontend" and "backend") Varnishes. (I am working on the caching-related bit of the performance guidelines, and want to understand and help push forward on https://www.mediawiki.org/wiki/Requests_for_comment/Simplify_thumbnail_cache .) I looked for docs but didn't find anything that had been updated this year.
Here's how I think it works, assuming you are a MediaWiki developer who's written, e.g., a page that includes a thumbnail of an image:
First, your code must get the metadata about the image, which might come from the local database, or memcached, or Commons. Then, you need to get a thumbnail of the image at the dimensions your page requires. Rather than creating the thumbnail immediately when the page is parsed, Wikimedia's MediaWiki is configured to use the "404 handler" (see [[Manual:Thumb_handler.php]]). Your page first receives a URL indicating the eventual location of the thumbnail, then the browser asks for that URL. If the thumbnail hasn't been created yet, the web server initially gets an internal 404 error; the 404 handler then parses the filename and dimensions out of the URL and kicks off the thumbnailer to create the thumbnail, and the response gets sent to the client.
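To check that I have the flow straight, here is a toy sketch of it in Python. It is not how thumb_handler.php is implemented; the store, the URL layout, and every function name are made up for illustration.

```python
# Toy model of the "404 handler" flow; all names here are hypothetical.
thumb_store = {}   # stands in for wherever rendered thumbnails are kept
render_count = 0   # counts how often the thumbnailer actually runs


def thumb_url(filename, width):
    """Build the URL where the thumbnail will eventually live (made-up layout)."""
    return f"/thumb/{filename}/{width}px-{filename}"


def render_thumbnail(filename, width):
    """Placeholder for the real thumbnailer; returns fake image bytes."""
    global render_count
    render_count += 1
    return f"<{width}px rendering of {filename}>".encode()


def handle_request(filename, width):
    """Serve a thumbnail, rendering it only if the lookup misses."""
    url = thumb_url(filename, width)
    if url not in thumb_store:  # the internal 404: nothing at this URL yet
        thumb_store[url] = render_thumbnail(filename, width)
    return thumb_store[url]


first = handle_request("Example.jpg", 120)   # miss: the thumbnailer runs
second = handle_request("Example.jpg", 120)  # hit: served from the store
```

In this model the second request never reaches the thumbnailer, which is the property I understand the 404 handler to provide.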
As it is sent to the client, each thumbnail is stored in Swift and in our frontline and secondary Varnish caches.
(The Varnish caches cache entire HTTP responses, including thumbnails of images, frequently-requested pages, ResourceLoader modules, and similar items that can be retrieved by URL. A weighted-random load balancer (LVS) distributes web requests to the frontline Varnishes, which keep responses in memory. But if a frontline Varnish doesn't have a response cached, it passes the request to the secondary Varnishes via hash-based load balancing (on the hash of the URL). The secondary Varnishes hold more responses, storing them on disk. Every URL is on at most one secondary Varnish.)
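Here is how I picture the two tiers, again as a rough Python sketch rather than anything resembling the real Varnish or LVS configuration. The class names, the plain (unweighted) random choice, and the SHA-256-modulo hashing are all my own simplifying assumptions.

```python
import hashlib
import random

origin_hits = 0  # how many requests fell through both cache tiers


def fetch_from_origin(url):
    """Placeholder for the application servers behind the caches."""
    global origin_hits
    origin_hits += 1
    return f"response for {url}"


class Backend:
    """Secondary Varnish: larger cache, modelled as a dict standing in for disk."""

    def __init__(self):
        self.disk = {}

    def get(self, url):
        if url not in self.disk:
            self.disk[url] = fetch_from_origin(url)
        return self.disk[url]


def pick_backend(url, backends):
    """Hash the URL so a given URL always maps to the same secondary cache."""
    digest = hashlib.sha256(url.encode()).digest()
    return backends[int.from_bytes(digest[:8], "big") % len(backends)]


class Frontend:
    """Frontline Varnish: small in-memory cache in front of the secondaries."""

    def __init__(self, backends):
        self.memory = {}
        self.backends = backends

    def get(self, url):
        if url not in self.memory:
            self.memory[url] = pick_backend(url, self.backends).get(url)
        return self.memory[url]


def handle(url, frontends):
    """Stand-in for LVS: pick a frontline cache at random (unweighted here)."""
    return random.choice(frontends).get(url)


backends = [Backend() for _ in range(3)]
frontends = [Frontend(backends) for _ in range(2)]
url = "/thumb/Example.jpg/120px-Example.jpg"
for _ in range(10):
    handle(url, frontends)

# Which secondary caches ended up holding this URL?
holders = [b for b in backends if url in b.disk]
```

If my understanding is right, `holders` always has exactly one entry however the requests are spread across the frontline caches, while any number of frontline caches may hold their own in-memory copy.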
So, at the end of this whole process, any given thumbnail is in:
* the Swift thumbnail store (and will persist until the canonical image changes, or is deleted, or we run out of space and flush Swift)
* the frontline and secondary Varnishes (and will persist until the canonical image changes, or is deleted, or we restart the frontline Varnishes, or we evict data from the hard disks of the secondary Varnishes)
Is this right?