On Wed, Aug 13, 2014 at 10:36 AM, Gilles Dubuc <gilles@wikimedia.org> wrote:
Currently the file page provides a set of different image sizes for the user to directly access. These sizes are usually width-based, but for tall images they are height-based. The thumbnail URLs used to generate them pass only a width.

What this means is that tall images end up with arbitrary thumbnail widths that don't follow the set of sizes meant for the file page. The end result from an ops perspective is that we end up with very diverse widths for thumbnails. Not a problem in itself, but the exposure of these random-ish widths on the file page means that we can't set a different caching policy for non-standard widths without affecting the images linked from the file page.
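To make the effect concrete, here is a minimal sketch of how a height-constrained size turns into an odd width in the URL. The bounding boxes in SIZE_LIMITS, the function names, and the rounding (floor) are assumptions for illustration, not the actual MediaWiki configuration or code.

```python
# Assumed (max width, max height) boxes for the file page; illustrative values only.
SIZE_LIMITS = [(320, 240), (640, 480), (800, 600), (1024, 768), (1280, 1024)]


def thumb_width(orig_w, orig_h, max_w, max_h):
    """Width that would appear in the thumbnail URL for one bounding box."""
    if orig_w * max_h > orig_h * max_w:
        # The image is wider than the box: the width is the limiting dimension.
        return max_w
    # The height is the limiting dimension, so the width is derived from the
    # aspect ratio (floor rounding is an assumption of this sketch).
    return max_h * orig_w // orig_h


def file_page_widths(orig_w, orig_h):
    """Widths requested for every bounding box, in order."""
    return [thumb_width(orig_w, orig_h, w, h) for w, h in SIZE_LIMITS]


print(file_page_widths(4000, 3000))  # landscape image: widths match the boxes
print(file_page_widths(1000, 3000))  # tall image: widths derived from heights
```

Under these assumptions the landscape image gets 320, 640, 800, 1024 and 1280, while the tall one gets 80, 160, 200, 256 and 341, none of which belong to the intended set.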

I see two solutions to this problem, if we want to introduce different caching tiers for thumbnail sizes that come from mediawiki and thumbnail sizes that were requested by other things.

The first one would be to always keep the size progression on the file page width-bound, even for soft-rotated images. The first drawback of this is that for very skinny/very wide images the file size progression between the sizes could become steep. The second drawback is that we'd often offer fewer size options, because they'd be based on the smallest dimension.

The second option would be to change the syntax of the thumbnail URLs to allow a height constraint. This is a pretty scary change.

If we don't do anything, it simply means that we'll have to apply the same caching policy to every size smaller than 1280. We could already save quite a bit of storage space by evicting non-standard sizes larger than that, but sizes smaller than 1280 would have to stay the way they are now.

Thoughts?

A workaround would be to add an extra parameter to height-constrained images, such as 623px-heightconstrained-Foo.png (as far as I can see, this is a non-scary change), and not purge files which have this parameter. A bit messy, but if the purging itself is also a temporary workaround until we have a new storage mechanism with proper usage-based cache eviction, there is no point messing with fundamentals of the thumbnail url syntax.
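A rough sketch of what the purge check could look like with such a marker. Only the 623px-heightconstrained-Foo.png form comes from the suggestion above; the STANDARD_WIDTHS set and the function names are hypothetical, and this is not how the purging is actually implemented.

```python
import re

# Assumed set of widths offered on the file page; illustrative values only.
STANDARD_WIDTHS = {320, 640, 800, 1024, 1280}

# "<width>px-", an optional "heightconstrained-" marker, then the file name.
THUMB_NAME = re.compile(
    r"^(?P<width>\d+)px-(?P<marker>heightconstrained-)?(?P<name>.+)$"
)


def parse_thumb_name(thumb_name):
    """Return (width, height_constrained, file_name) for a thumbnail name."""
    m = THUMB_NAME.match(thumb_name)
    if m is None:
        raise ValueError(f"not a thumbnail name: {thumb_name!r}")
    return int(m["width"]), m["marker"] is not None, m["name"]


def is_purgeable(thumb_name):
    """Purge only thumbnails that are neither marked nor a standard width."""
    width, height_constrained, _ = parse_thumb_name(thumb_name)
    return not height_constrained and width not in STANDARD_WIDTHS


print(is_purgeable("623px-heightconstrained-Foo.png"))  # marked, so kept
print(is_purgeable("623px-Foo.png"))                    # unmarked odd width
```

One thing this glosses over: a file whose real name starts with "heightconstrained-" would be indistinguishable from a marked thumbnail, so the marker would need to be something that cannot occur in a file name.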