what is the main use case?
- as maintainers of the wikimedia media file servers, we want to reduce the number of images cached in order to save storage space and cost?
- and/or something else?
is it possible to cache based on a last accessed timestamp?
- if an image size has not been accessed within x days, purge it from the cache
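To make the last-accessed idea concrete, here is a minimal sketch. It assumes a hypothetical in-memory index that maps a thumbnail key to the time of its most recent access; the key format, the function name and the 30-day threshold are illustrative only and do not describe how the Wikimedia thumbnail store actually works.

```python
from datetime import datetime, timedelta


def find_stale_thumbnails(last_accessed, now, max_idle_days):
    """Return the keys of thumbnails not accessed within max_idle_days.

    last_accessed: dict mapping a thumbnail key (hypothetical format)
    to the datetime of its most recent access.
    """
    cutoff = now - timedelta(days=max_idle_days)
    return sorted(key for key, ts in last_accessed.items() if ts < cutoff)


# Illustrative data: one thumbnail accessed yesterday, one untouched for months.
now = datetime(2014, 8, 13)
index = {
    "Example.jpg/320px": datetime(2014, 8, 12),
    "Example.jpg/317px": datetime(2014, 6, 1),
}
stale = find_stale_thumbnails(index, now, max_idle_days=30)
```

Anything returned in `stale` would be a candidate for purging; whether the storage backend can cheaply track access times is a separate question.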
with kind regards, dan
On Aug 13, 2014, at 11:18, Neil Kandalgaonkar <neilk@neilk.net> wrote:
I think I need more context. Is this what you're saying?
- we want to use less storage space
- images we are generating and caching for not-Wikipedia should be the first to go
- we assume weird sizes are from not-Wikipedia. So let's cache them for less time
- except, that doesn't work, because of tall images
- so maybe we should change the image request format?
If this is accurate I have a few questions:
- If you want to prioritize Wiki[mp]edia thumbnails, why not use the referrer header instead? Why use the width parameter to detect this?
- Are we sure we'll improve overall performance by evicting certain files from cache quicker? Why not trust the LRU cache algorithm?
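For reference, the LRU behaviour mentioned in the last question can be sketched as follows. This is a toy in-memory model with hypothetical keys, not the cache software actually deployed; it only illustrates that a size nobody requests again is the first to be evicted, without any special policy for it.

```python
from collections import OrderedDict


class LRUCache:
    """Toy least-recently-used cache with a fixed number of entries."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()  # oldest entry first, newest last

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict least recently used

    def keys(self):
        return list(self._items)


cache = LRUCache(capacity=2)
cache.put("Example.jpg/320px", b"a")
cache.put("Example.jpg/317px", b"b")
cache.get("Example.jpg/320px")        # the standard size is requested again
cache.put("Example.jpg/640px", b"c")  # the cache is full, so 317px is evicted
```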
On 8/13/14, 1:36 AM, Gilles Dubuc wrote:
Currently the file page provides a set of different image sizes for the user to access directly. These sizes are usually width-based. For tall images, however, they are height-based. The thumbnail URLs used to generate them pass only a width.
What this means is that tall images end up with arbitrary thumbnail widths that don't follow the set of sizes meant for the file page. The end result from an ops perspective is that we end up with very diverse widths for thumbnails. Not a problem in itself, but the exposure of these random-ish widths on the file page means that we can't set a different caching policy for non-standard widths without affecting the images linked from the file page.
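As a rough illustration of where these widths come from, the sketch below fits an image into a set of bounding boxes and reports the resulting thumbnail width. The box list and the floor rounding are assumptions made for the example; the real file page sizes and rounding rules may differ.

```python
# Assumed bounding boxes (width, height) for the file page size links.
ASSUMED_BOXES = [(320, 240), (640, 480), (800, 600), (1024, 768), (1280, 1024)]


def thumbnail_width(orig_w, orig_h, box_w, box_h):
    """Width of the largest thumbnail that fits in the box, aspect ratio kept."""
    if orig_w * box_h > orig_h * box_w:
        # The image is relatively wider than the box: width is the constraint.
        return box_w
    # Height is the constraint, so the width is derived from the height.
    return max(1, orig_w * box_h // orig_h)


wide = [thumbnail_width(3000, 1000, w, h) for w, h in ASSUMED_BOXES]
tall = [thumbnail_width(1000, 3000, w, h) for w, h in ASSUMED_BOXES]
```

With these assumed boxes, the 3000x1000 image maps onto the expected widths (320, 640, 800, 1024, 1280), while the 1000x3000 image ends up with widths of 80, 160, 200, 256 and 341, none of which match the width-based set.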
I see two solutions to this problem, if we want to introduce different caching tiers for thumbnail sizes that come from MediaWiki and thumbnail sizes that were requested by other things.
The first one would be to always keep the size progression on the file page width-bound, even for soft-rotated images. The first drawback of this is that for very skinny/very wide images the file size progression between the sizes could become steep. The second drawback is that we'd often offer fewer size options, because they'd be based on the smallest dimension.
The second option would be to change the syntax of the thumbnail URLs in order to allow a height constraint. This is a pretty scary change.
If we don't do anything, it simply means that we'll have to apply the same caching policy to every size smaller than 1280. We could already save quite a bit of storage space by evicting non-standard sizes larger than that, but sizes smaller than 1280 would have to stay the way they are now.
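The fallback policy described here could be sketched like this. The set of standard widths and the tier names are assumptions for illustration only; the point is that anything up to 1280 keeps the default policy, because tall images legitimately produce odd widths in that range.

```python
# Assumed set of standard widths; the real list may differ.
ASSUMED_STANDARD_WIDTHS = {320, 640, 800, 1024, 1280, 1920, 2560}
THRESHOLD = 1280


def cache_tier(width):
    """Pick a caching tier (hypothetical names) for a thumbnail width."""
    if width > THRESHOLD and width not in ASSUMED_STANDARD_WIDTHS:
        return "short"   # non-standard and large: evict sooner
    return "default"     # everything else keeps the current policy


tiers = {w: cache_tier(w) for w in (341, 1280, 1300, 1920)}
```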
Thoughts?
Multimedia mailing list
Multimedia@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/multimedia
-- Neil Kandalgaonkar (| neilk@neilk.net