There've been some issues reported lately with image scaling, where resource usage on very large images has been huge (problematic for batch uploads from a high-resolution source). Even the scaling time for typical several-megapixel JPEG photos can be slower than desired when loading them into something like the MMV (MultimediaViewer) extension.
I've previously proposed limiting the set of thumbnail sizes that can be generated and pre-generating those fixed sizes at upload time, but this hasn't been a popular idea because of the lack of flexibility, and because sending larger-than-needed fixed image sizes could mean poor client-side scaling or inefficient network use.
Here's an idea that blends the performance benefits of pre-scaling with the flexibility of our current model...
A classic technique in 3D graphics is mip-mapping (https://en.wikipedia.org/wiki/Mip-mapping), where an image is pre-scaled to multiple resolutions, usually each 1/2 the width and height of the next level up.
When drawing a textured polygon on screen, the system picks the most closely-sized level of the mipmap to draw, reducing the resources needed and avoiding some classes of aliasing/moiré patterns when scaling down. If you want to get fancy you can also use trilinear filtering (https://en.wikipedia.org/wiki/Trilinear_filtering), where the next-size-up and next-size-down mip-map levels are combined -- this further reduces artifacting.
I'm wondering if we can use this technique to help with scaling of very large images:

* at upload time, perform a series of scales to produce the mipmap levels
* _don't consider the upload complete_ until those are done! a web uploader or API-using bot should probably wait until it's done before uploading the next file, for instance...
* once upload is complete, keep on making user-facing thumbnails as before... but make them from the smaller mipmap levels instead of the full-scale original
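For concreteness, here's a minimal sketch of what that upload-time step might look like, assuming ImageMagick's convert as the scaler and a simple halving series of widths (the widths and file names are placeholders, not a proposal):

    # hypothetical mipmap generation at upload time; each level is produced
    # from the previous (larger) level, classic mipmap style -- it could just
    # as well be produced from the original each time
    src="original.jpg"
    prev="$src"
    for width in 4096 2048 1024 512 256; do
        # in practice, levels wider than the original would be skipped
        convert "$prev" -resize "${width}x" "mip-${width}.jpg"
        prev="mip-${width}.jpg"
    done

A later request for, say, a 600px thumbnail would then be rendered from mip-1024.jpg rather than from the full-size original.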
This would avoid changing our external model -- where server-side scaling can be used to produce arbitrary-size images that are well-optimized for their target size -- while reducing resource usage for thumbs of huge source images. We can also still do things like applying a sharpening effect on photos, which people sorely miss when it's absent.
If there's interest in investigating this scenario I can write up an RfC with some more details.
(Properly handling multi-page files like PDFs, DjVu, or paged TIFFs could complicate this by making the initial rendering extraction pretty slow, though, so that needs consideration.)
-- brion
On 04/30/2014 12:51 PM, Brion Vibber wrote:
If there's interest in investigating this scenario I can write up an RfC with some more details.
Yes, please do! This is very close to what Aaron and I have been discussing recently on the ops list as well.
Gabriel
On 04/30/2014 12:51 PM, Brion Vibber wrote:
- at upload time, perform a series of scales to produce the mipmap levels
Would it not suffice to just produce *one* scaled-down version (i.e. 2048px) which the real-time scaler can use to produce the thumbs?
Regards,
Erwin Dokter
_don't consider the upload complete_ until those are done! a web uploader or API-using bot should probably wait until it's done before uploading the next file, for instance...
You got me a little confused at that point: are you talking about the client generating the intermediary sizes, or the server?
I think client-side thumbnail generation is risky when things start getting corrupted. A client-side bug could result in a user uploading thumbnails that belong to a different image. And if you want to run a visual signature check on the server side to avoid that issue, you might be looking at similar processing time to verify that the thumbnail matches the correct image as if the server were to generate the actual thumbnail. It would be worth researching whether there's a very fast "is this thumbnail a smaller version of that image" algorithm out there. We don't need 100% confidence either, if we're looking to avoid shuffling bugs in a given upload batch.
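One cheap heuristic along those lines -- just a sketch, not a claim about what we'd actually deploy -- would be to shrink both the server-side original and the submitted thumbnail to the same tiny signature size and compare them with a fuzzy metric, e.g. ImageMagick's compare:

    # hypothetical fuzzy check: reduce both images to a small fixed-size
    # signature and measure how different they are; a large RMSE suggests
    # the submitted thumbnail does not belong to this original
    convert original.jpg        -resize '64x64!' sig-orig.png
    convert submitted-thumb.jpg -resize '64x64!' sig-thumb.png
    # compare prints the RMSE to stderr; "null:" discards the diff image
    compare -metric RMSE sig-orig.png sig-thumb.png null: 2> rmse.txt
    cat rmse.txt

Whatever threshold separates "shuffled" from "merely recompressed" would have to be determined empirically; per the above, it only needs to catch gross mismatches.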
Regarding the issue of a single intermediary size versus multiple, there's still a near-future plan to have pregenerated buckets for Media Viewer (which can be reused for a whole host of other things). Those could be used as mip-maps, like you describe. Since these sizes will be generated at upload time, why not use them?
However, noticeable visual artifacts start to appear when the bucket's (i.e. the source image's) dimensions are too close to those of the thumbnail you want to render.
Consider the existing Media Viewer width buckets: 320, 640, 800, 1024, 1280, 1920, 2560, 2880
I think that generating the 300px thumbnail based on the 320 bucket is likely to introduce very visible artifacts with thin lines, etc. compared to using the biggest bucket (2880px). Maybe there's a smart compromise, like picking a higher bucket (e.g. the 300px thumbnail would use the 640 bucket as its source, etc.). I think we need a battery of visual tests to determine the best strategy here; a strawman rule is sketched below.
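As a strawman, the compromise could be as simple as picking the smallest pregenerated bucket that is at least twice the requested width, and falling back to the original when no bucket qualifies (the factor of two is a guess that the visual tests would have to validate):

    # hypothetical source selection for a requested thumbnail width
    pick_source_bucket() {
        local requested=$1
        for b in 320 640 800 1024 1280 1920 2560 2880; do
            # require some headroom above the requested width so that
            # downscaling artifacts stay negligible
            if [ "$b" -ge $((requested * 2)) ]; then
                echo "$b"
                return
            fi
        done
        echo "original"
    }
    pick_source_bucket 300    # -> 640
    pick_source_bucket 1500   # -> original (no bucket is twice as wide)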
All of this is dependent on Ops giving the green light for pregenerating the buckets, though. The Swift capacity for it is slowly being brought online, but I think Ops' prerequisite for saying yes is that we focus on the post-Swift strategy for thumbnails. We also need to figure out the performance impact of generating all these thumbnails on upload. On a very meta note, we might generate the smaller buckets based on the biggest bucket and only the 2-3 biggest buckets based on the original (again, to avoid visual artifacts).
Another related angle I'd like to explore is to submit a simplified version of this RFC: https://www.mediawiki.org/wiki/Requests_for_comment/Standardized_thumbnails_... where we'd propose a single bucket list option instead of multiple (presumably the Media Viewer ones; if not, we'd update Media Viewer to use the new canonical list of buckets), and where we would still allow arbitrary thumbnail sizes below a certain limit. For example, people would still be allowed to request thumbnails smaller than 800px at any size they want, because those are likely to be thumbnails in the real sense of the term, while anything above 800px would be limited to the available buckets (e.g. 1024, 1280, 1920, 2560, 2880). This would still allow foundation-hosted wikis to have flexible layout strategies for their thumbnail sizes, while reducing this attack vector on the image scalers and the gigantic waste of disk and memory space on the thumbnail hosting. I think it would be an easier sell for the community: the current RFC is too extreme in banning all arbitrary sizes and offers too many bucketing options, and I feel like the standardization of true thumbnail sizes (small images, <800px) is much more subject to endless debate with no consensus.
Another point about picking the "one true bucket list": Media Viewer's current buckets were picked based on the most common screen resolutions, because Media Viewer always tries to use the entire width of the screen to display the image. Aiming for a 1-to-1 pixel correspondence makes sense there, because it should give the sharpest possible result to the average user.
However, sticking to that approach will likely introduce a cost. As I've just mentioned, we will probably need to generate more than one of the high buckets based on the original, in order to avoid resizing artifacts.
On the other hand, we could decide that the unified bucket list shouldn't be based on screen resolutions (after all, the full-width display scenario found in Media Viewer might be the exception, and the buckets will be for everything in MediaWiki) and instead should progress by powers of 2. Then creating a given bucket could always be done without resizing artifacts, based on the bucket above the current one. This should provide the biggest possible savings in image scaling time when generating the thumbnail buckets.
To illustrate with an example, the bucket list could be: 256, 512, 1024, 2048, 4096. The 4096 bucket would be generated first, based on the original, then 2048 would be generated based on 4096, then 1024 based on 2048, etc.
The big downside is that there's less progression in the 1000-3000 range (4 buckets in the Media Viewer resolution-oriented strategy, 2 buckets here) where the majority of devices currently are. If I take a test image as an example (https://commons.wikimedia.org/wiki/File:Swallow_flying_drinking.jpg), the file size progression is quite different between the screen resolution buckets and the geometric (powers of 2) buckets:
- screen resolution buckets:
  320: 11.7 KB
  640: 17 KB
  800: 37.9 KB
  1024: 58 KB
  1280: 89.5 KB
  1920: 218.9 KB
  2560: 324.6 KB
  2880: 421.5 KB

- geometric buckets:
  256: 9.4 KB
  512: 20 KB
  1024: 58 KB
  2048: 253.1 KB
  4096: (test image is smaller than 4096)
It seems less than ideal that a screen resolution slightly above 1024 would suddenly need to download an image 5 times as heavy, for not that many extra pixels on the actual screen. The same can be said of the screen-resolution progression itself, where the file size more than doubles between 1280 and 1920; we could probably use at least one extra step between those two if we go with screen resolution buckets, like 1366 and/or 1440.
I think the issue of buckets between 1000 and 3000 is tricky: it's going to be difficult to avoid generating them based on the original while not getting visual artifacts.
Maybe we can get away with generating 1280 (and possibly 1366 and 1440) based on 2048, the distance between the two guaranteeing that the quality issues will be negligible. We definitely can't generate a 1920 based on a 2048 thumbnail, though, otherwise artifacts on thin lines will look awful.
A mixed progression like this might be the best of both worlds, if we confirm that between 1024 and 2048 the resizing is artifact-free enough:
256, 512, 1024, 1280, 1366, 1440, 2048, 4096, where 2048 is generated from 4096; 1024, 1280, 1366 and 1440 from 2048; 512 from 1024; and 256 from 512.
If for example the image width is between 1440 and 2048, then 1024, 1280, 1366, 1440 would be generated based on the original. That's fine performance-wise, since the original is small.
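Spelled out as a bucket -> source map (assuming the original is at least 4096 wide; per the previous paragraph, smaller originals would directly serve as the source for the affected buckets):

    # hypothetical source map for the mixed progression; each bucket is
    # rendered from the listed source rather than from the original
    declare -A source_for=(
        [4096]="original"
        [2048]="4096"
        [1440]="2048" [1366]="2048" [1280]="2048" [1024]="2048"
        [512]="1024"
        [256]="512"
    )
    # render in descending order so every source already exists
    for bucket in 4096 2048 1440 1366 1280 1024 512 256; do
        echo "render ${bucket}px from ${source_for[$bucket]}"
    done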
Something that might also be useful to generate is a thumbnail of the same size as the original, if the original is < 4096 (or whatever the highest bucket is). Currently we seem to block generating such a thumbnail, but the difference in file size is huge. For the test image mentioned above, which is 3002 pixels wide, the original is 3.69 MB, while a thumbnail of the same size would be 465 KB. For the benefit of retina displays that are 2560/2880, displaying a thumbnail of the same size as a 3002 original would definitely be better than the highest available bucket (2048).
All of this is benchmark-worthy anyway; I might be splitting hairs looking for powers of two if rendering a bucket chain (each bucket generated from the next one up) isn't that much faster than generating all buckets based on the biggest bucket.
An extremely crude benchmark on our multimedia labs instance, still using the same test image:
original -> 3002 (original size): 0m0.268s
original -> 2048: 0m1.344s
original -> 1024: 0m0.856s
original -> 512: 0m0.740s
original -> 256: 0m0.660s
2048 -> 1024: 0m0.444s
2048 -> 512: 0m0.332s
2048 -> 256: 0m0.284s
1024 -> 512: 0m0.112s
512 -> 256: 0m0.040s
This confirms that chaining, instead of generating all thumbnails based on the biggest bucket, saves a significant amount of processing time. It's definitely in the same order of magnitude as the savings achieved by going from the original as the source to the biggest bucket as the source.
It's also worth noting that generating the thumbnail of the same size as the original is relatively cheap. Using it as the source for the 2048 image doesn't save that much time, though: 0m1.252s (for 3002 -> 2048).
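For anyone who wants to reproduce this, each measurement boils down to timing a single ImageMagick call, along these lines (simplified; the exact options our scalers pass, sharpening in particular, are left out here):

    # e.g. original -> 2048, then 2048 -> 1024 (the chained step)
    time convert Swallow_flying_drinking.jpg -resize 2048x swallow-2048.jpg
    time convert swallow-2048.jpg -resize 1024x swallow-1024.jpg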
And here's a side-by-side comparison of these images generated with chaining and images that come from our regular image scalers: https://dl.dropboxusercontent.com/u/109867/imagickchaining/index.html Try to guess which is which before inspecting the page for the answer :)
Hi Gilles,
Thanks for the comparison images. When I was playing around with this a while back, I found that images with lots of parallel lines and lots of easily recognized detail were the best for seeing what sorts of problems rescaling can cause. Here are a few:
https://commons.wikimedia.org/wiki/File:13-11-02-olb-by-RalfR-03.jpg
https://commons.wikimedia.org/wiki/File:Basel_-_M%C3%BCnsterpfalz1.jpg
https://commons.wikimedia.org/wiki/File:Bouquiniste_Paris.jpg
I hadn't used faces, but a good group photo with easily recognized faces is something else to possibly try (our brains are really good at spotting subtle differences in faces).
Rob
The slowest part of large images scaling in production is their retrieval from Swift, which could be much faster for bucketed images.
On 01-05-2014 16:57, Gilles Dubuc wrote:
And here's a side-by-side comparison of these images generated with chaining and images that come from our regular image scalers: https://dl.dropboxusercontent.com/u/109867/imagickchaining/index.html Try to guess which is which before inspecting the page for the answer :)
Not much difference, but it's there. Progressive scaling loses edge detail during each stage. The directly scaled images look sharper.
Regards,
Erwin Dokter
Another round, using one of Rob's test images.
The directly scaled images look sharper.
I've now realized that our image scaler uses the -sharpen option in most cases (as long as the thumbnail is 0.85 times the size of the original or smaller, if I'm reading the code correctly). This time I applied -sharpen 0x0.8 to the right buckets in the chain (in this case every bucket except 4096) for a fairer comparison.
And this time, instead of a side-by-side, there are two pages, so that you can see the difference better by switching between tabs:
https://dl.dropboxusercontent.com/u/109867/imagickchaining/2/a.html https://dl.dropboxusercontent.com/u/109867/imagickchaining/2/b.html
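For reference, each chained step was essentially of this shape (a simplified sketch; which scaling operator -- -resize or -thumbnail -- is used is glossed over here, and the option order matters):

    # e.g. the 2048 bucket rendered from the 4096 bucket, with the same
    # -sharpen 0x0.8 pass applied to every bucket except the largest
    convert bucket-4096.jpg -resize 2048x -sharpen 0x0.8 bucket-2048.jpg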
On Thu, May 1, 2014 at 7:02 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Another point about picking the "one true bucket list": currently Media Viewer's buckets have been picked based on the most common screen resolutions, because Media Viewer tries to always use the entire width of the screen to display the image, so trying to achieve a 1-to-1 pixel correspondence makes sense, because it should give the sharpest result possible to the average user.
I'm not sure the current size list is particularly useful for MediaViewer: since we are fitting images into the screen, and the huge majority of images are constrained by height, the width of the image on the screen will be completely unrelated to the width bucket size. Having common screen sizes as width buckets would be useful if we were filling instead of fitting (something that might make sense for paged media).
------
I wonder if the mip-mapping approach could somehow be combined with tiles? If we want proper zooming for large images, we will have to split them up into tiles of various sizes, and serve only the tiles for the visible portion when the user zooms on a small section of the image. Splitting up an image is a fast operation, so maybe it could be done on the fly (with caching for a small subset based on traffic), in which case having a chain of scaled versions of the image would take care of the zooming use case as well.
Buttons is French: Suiv. -> Make it English
That's a bug in SurveyMonkey: the buttons are in French because I was using the French version of the site at the time the survey was created, and now the text on those buttons can't be fixed. I'll make sure to switch SurveyMonkey to English before creating the next one.
No "swap" or "overlay" function for being able to compare
SurveyMonkey is quite limited, and it can't do that, unfortunately. The alternative would be to build my own survey from scratch, but that would require a lot of resources for little benefit. This is really a one-off need.
I wonder if the mip-mapping approach could somehow be combined with tiles? If we want proper zooming for large images, we will have to split them up into tiles of various sizes, and serve only the tiles for the visible portion when the user zooms on a small section of the image. Splitting up an image is a fast operation, so maybe it could be done on the fly (with caching for a small subset based on traffic), in which case having a chain of scaled versions of the image would take care of the zooming use case as well.
Yes, we could definitely split up the reference thumbnail sizes on the fly to generate tiles when we get around to implementing proper zooming. It's as simple as having Varnish cache the tiles and the PHP backend generate them on the fly by splitting the reference thumbnails.
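The splitting step itself is a one-liner, assuming ImageMagick and an arbitrary 256px tile size (both placeholders):

    # hypothetical on-the-fly tiling of a pregenerated bucket: -crop with a
    # bare size cuts the image into a grid of equal tiles, +repage drops the
    # per-tile canvas offsets, %03d numbers the output files
    convert bucket-2048.jpg -crop 256x256 +repage tile-%03d.jpg

The PHP backend would map a tile URL to one of these output files, and Varnish would cache each tile individually.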
Regarding the survey I ran on wikitech-l: so far there are 26 respondents. It seems that on images with a lot of edges (the test images provided by Rob) at least 30% of people can tell the difference in terms of quality/sharpness, while on regular images people can't really tell. Thus, I wouldn't venture into full chaining if a third of visitors would be able to tell that there's a quality degradation. I'll run another survey later in the week where, instead of full chaining, all the thumbs are generated based on the biggest thumb.
After taking a closer look at exactly what commands run in production, it turns out that I probably applied the ImageMagick parameters in the wrong order when I put together the survey (order matters, particularly for sharpening). I'll regenerate the images and make another (hopefully better) survey that will compare the status quo, chained thumbnails, and thumbnails generated from a single reference thumbnail.
On Mon, May 5, 2014 at 11:04 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Buttons is French: Suiv. -> Make it English
That's a bug in SurveyMonkey, the buttons are in French because I was using the French version of the site at the time the survey was created, and now that text on those buttons can't be fixed. I'll make sure to switch SurveyMoney to English before creating the next one.
No "swap" or "overlay" function for being able to compare
SurveyMonkey is quite limited, it can't do that, unfortunately. The alternative would be to build my own survey from scratch, but that would be require a lot of resources for little benefit. This is really a one-off need.
I wonder if the mip-mapping approach could somehow be combined with tiles? If we want proper zooming for large images, we will have to split them up into tiles of various sizes, and serve only the tiles for the visible portion when the user zooms on a small section of the image. Splitting up an image is a fast operation, so maybe it could be done on the fly (with caching for a small subset based on traffic), in which case having a chain of scaled versions of the image would take care of the zooming use case as well.
Yes we could definitely have the reference thumbnail sizes be split up on the fly to generate tiles, when we get around to implementing proper zooming. It's as simple as making Varnish cache the tiles and the php backend generate them on the fly by splitting the reference thumbnails.
Regarding the survey I ran on wikitech-l: so far there are 26 respondents. On the images with a lot of edges (the test images provided by Rob), at least 30% of people can tell the difference in terms of quality/sharpness; on regular images people can't really tell. I therefore wouldn't venture to do full chaining, as a third of visitors would be able to tell that there's a quality degradation. Later in the week I'll run another survey where, instead of full chaining, all the thumbs are generated from the biggest thumb.
On Sat, May 3, 2014 at 1:25 AM, Gergo Tisza gtisza@wikimedia.org wrote:
On Thu, May 1, 2014 at 7:02 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Another point about picking the "one true bucket list": currently Media Viewer's buckets have been picked based on the most common screen resolutions, because Media Viewer tries to always use the entire width of the screen to display the image, so trying to achieve a 1-to-1 pixel correspondence makes sense, because it should give the sharpest result possible to the average user.
I'm not sure the current size list is particularly useful for MediaViewer, since we are fitting images into the screen, and the huge majority of images are constrained by height, so the width of the image on the screen will be completely unrelated to the width bucket size. Having common screen sizes as width buckets would be useful if we would be filling instead of fitting (something that might make sense for paged media).
I wonder if the mip-mapping approach could somehow be combined with tiles? If we want proper zooming for large images, we will have to split them up into tiles of various sizes, and serve only the tiles for the visible portion when the user zooms on a small section of the image. Splitting up an image is a fast operation, so maybe it could be done on the fly (with caching for a small subset based on traffic), in which case having a chain of scaled versions of the image would take care of the zooming use case as well.
In a surprising turn of events, the latest survey https://www.surveymonkey.com/s/F6CGPDJ shows that people consistently prefer the chained thumbnails (each thumbnail generated from the next bigger one) to the ones we currently generate from the original and to thumbnails all generated from the largest thumbnail, both in terms of sharpness (not surprising, since chaining applies more passes of sharpening) and in terms of quality. I suspect this is because the extra sharpening is visually more pronounced.
JHeald on the Commons village pump also brought up the fact that the resizing we currently do with ImageMagick's -thumbnail introduces artifacts on some images, which I've verified:
* using -thumbnail: https://dl.dropboxusercontent.com/u/109867/imagickchaining/sharpening/435-sh...
* using -resize: https://dl.dropboxusercontent.com/u/109867/imagickchaining/sharpening/435-sh...
(I found out about that after the survey, for which all images were generated using the status quo -thumbnail option.)
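For reference, the difference is just which resize operator is passed to convert. A minimal sketch with a made-up output width, assuming the convert binary is on PATH; the linked comparison images were not produced by this snippet:

    import subprocess

    src = "source.jpg"  # hypothetical input file

    # Status quo: -thumbnail, which is optimized for speed and strips profiles,
    # but shows the artifacts JHeald pointed out on some images.
    subprocess.run(["convert", src, "-thumbnail", "1024x", "with-thumbnail.jpg"], check=True)

    # Alternative: -resize, slower on large originals but free of those artifacts.
    subprocess.run(["convert", src, "-resize", "1024x", "with-resize.jpg"], check=True)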
I'm pretty sure we are using -thumbnail because it advertises itself as being faster for large images. However, based on the testing I've done on large images, it seems that if we chained thumbnail generation, the performance gains would be large enough that we could afford to use -resize and avoid those artifacts, while still generating thumbnails much faster than we currently do.
In conclusion, it seems safe to implement chaining, where we maintain a set of reference thumbnails, each generated from the next larger one. According to the survey, image quality isn't negatively impacted by doing that, and we would be able to use -resize, which would save us from the artifacts and improve image quality. Unless anyone objects, the Multimedia team can start working on that change. I consider generating those reference thumbnails at upload time, before the file is considered uploaded, to be a separate task, which we're also exploring at the moment.
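A minimal sketch of the chaining described above, assuming hypothetical bucket widths, a made-up sharpen value, and that ImageMagick's convert is on PATH; the real implementation would of course live in the MediaWiki scalers:

    import subprocess

    BUCKETS = [2560, 1280, 640, 320]  # hypothetical reference widths, largest first

    def build_reference_chain(original, out_pattern="ref-%dpx.jpg"):
        # Render the largest reference from the original, then derive each
        # smaller reference from the next larger one, using -resize rather
        # than -thumbnail to avoid the artifacts mentioned above.
        source = original
        outputs = []
        for width in BUCKETS:
            out = out_pattern % width
            subprocess.run(["convert", source,
                            "-resize", "%dx" % width,
                            "-sharpen", "0x0.8",
                            out], check=True)
            outputs.append(out)
            source = out  # the next, smaller level is generated from this one
        return outputs

Arbitrary user-facing sizes would then be rendered from the closest reference level rather than from the full-size original.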
On Fri, May 9, 2014 at 10:59 AM, Gilles Dubuc gilles@wikimedia.org wrote:
After taking a closer look at exactly which commands run in production, it turns out that I probably applied the ImageMagick parameters in the wrong order when I put together the survey (order matters, particularly for sharpening). I'll regenerate the images and make another (hopefully better) survey that compares the status quo, chained thumbnails, and a single-thumbnail reference.
On Mon, May 5, 2014 at 11:04 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Buttons are in French: Suiv. -> Make them English
That's a quirk of SurveyMonkey: the buttons are in French because I was using the French version of the site when the survey was created, and the text on those buttons can't be changed after the fact. I'll make sure to switch SurveyMonkey to English before creating the next one.
No "swap" or "overlay" function for being able to compare
SurveyMonkey is quite limited and can't do that, unfortunately. The alternative would be to build my own survey from scratch, but that would require a lot of resources for little benefit; this is really a one-off need.
I wonder if the mip-mapping approach could somehow be combined with tiles? If we want proper zooming for large images, we will have to split them up into tiles of various sizes, and serve only the tiles for the visible portion when the user zooms on a small section of the image. Splitting up an image is a fast operation, so maybe it could be done on the fly (with caching for a small subset based on traffic), in which case having a chain of scaled versions of the image would take care of the zooming use case as well.
Yes, we could definitely split the reference thumbnail sizes up on the fly to generate tiles when we get around to implementing proper zooming. It's as simple as having Varnish cache the tiles and the PHP backend generate them on the fly by cropping the reference thumbnails.
Regarding the survey I ran on wikitech-l: so far there are 26 respondents. On the images with a lot of edges (the test images provided by Rob), at least 30% of people can tell the difference in terms of quality/sharpness; on regular images people can't really tell. I therefore wouldn't venture to do full chaining, as a third of visitors would be able to tell that there's a quality degradation. Later in the week I'll run another survey where, instead of full chaining, all the thumbs are generated from the biggest thumb.
On Sat, May 3, 2014 at 1:25 AM, Gergo Tisza gtisza@wikimedia.org wrote:
On Thu, May 1, 2014 at 7:02 AM, Gilles Dubuc gilles@wikimedia.org wrote:
Another point about picking the "one true bucket list": currently Media Viewer's buckets have been picked based on the most common screen resolutions, because Media Viewer tries to always use the entire width of the screen to display the image, so trying to achieve a 1-to-1 pixel correspondence makes sense, because it should give the sharpest result possible to the average user.
I'm not sure the current size list is particularly useful for MediaViewer, since we are fitting images into the screen, and the huge majority of images are constrained by height, so the width of the image on the screen will be completely unrelated to the width bucket size. Having common screen sizes as width buckets would be useful if we would be filling instead of fitting (something that might make sense for paged media).
I wonder if the mip-mapping approach could somehow be combined with tiles? If we want proper zooming for large images, we will have to split them up into tiles of various sizes, and serve only the tiles for the visible portion when the user zooms on a small section of the image. Splitting up an image is a fast operation, so maybe it could be done on the fly (with caching for a small subset based on traffic), in which case having a chain of scaled versions of the image would take care of the zooming use case as well.
On Thu, May 1, 2014 at 3:54 AM, Gilles Dubuc gilles@wikimedia.org wrote:
_don't consider the upload complete_ until those are done! a web uploader or API-using bot should probably wait until it's done before uploading the next file, for instance...
You got me a little confused at that point: are you talking about the client generating the intermediary sizes, or the server?
Server.
-- brion