On Mon, Jul 18, 2016 at 5:40 PM, Gergo Tisza <gtisza@wikimedia.org> wrote:
On Fri, Jul 15, 2016 at 10:25 AM, Bartosz DziewoƄski <bdziewonski@wikimedia.org> wrote:
On Fri, 15 Jul 2016 08:35:42 +0200, Pau Giner <pginer@wikimedia.org> wrote:

I thought it could be of interest:
https://blogs.dropbox.com/tech/2016/07/lepton-image-compression-saving-22-losslessly-from-images-at-15mbs/

Interesting. I'm not sure if our upload backend people are on this list (mostly Aaron and Filippo take care of it). You might want to forward it to them.

I wonder if the disk space savings for us would be worth the engineering time. It probably was for Dropbox (if the 22% reduction saves them "multiple petabytes of space"), but we "only" have terabytes of JPG images (most of them on Commons: https://commons.wikimedia.org/wiki/Special:MediaStatistics), and hard disks are not that expensive. I'm not really qualified to judge, though :)

How would that work? Most of Dropbox's traffic is probably between their servers and their (desktop or mobile) clients, so they can just convert to/from JPEG on both endpoints. For Wikimedia, ~99% of the traffic is sent to a web browser which does not support Lepton.

We could compress upon writing and decompress upon reading (in Swift, that is; no Varnish involved). As Bartosz mentioned, though, at our sizes I don't think it would be worth the engineering time. Commons thumbs + originals are roughly 160 TB, so 20% is about 32 TB; in other words we would "save" roughly a machine's worth of 3 TB disks (3x that if you include replication), while spending CPU to compress/decompress.
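
For illustration only, a minimal sketch of what that compress-on-write / decompress-on-read step could look like, assuming Dropbox's lepton binary is on the storage hosts and keeps its documented "lepton <in> <out>" invocation; the Swift hook points named in the comments are hypothetical:

    import os
    import subprocess
    import tempfile

    LEPTON_BIN = "lepton"  # assumption: Dropbox's lepton binary is on PATH

    def jpeg_to_lepton(jpeg_bytes):
        """Compress JPEG bytes to Lepton format via the lepton CLI."""
        return _run_lepton(jpeg_bytes, ".jpg", ".lep")

    def lepton_to_jpeg(lepton_bytes):
        """Restore the original JPEG bytes from Lepton data."""
        return _run_lepton(lepton_bytes, ".lep", ".jpg")

    def _run_lepton(data, in_suffix, out_suffix):
        with tempfile.TemporaryDirectory() as tmp:
            src = os.path.join(tmp, "in" + in_suffix)
            dst = os.path.join(tmp, "out" + out_suffix)
            with open(src, "wb") as f:
                f.write(data)
            # lepton decides compress vs. decompress from the input contents
            subprocess.run([LEPTON_BIN, src, dst], check=True)
            with open(dst, "rb") as f:
                return f.read()

    # Hypothetical hook points in the storage path:
    #   on PUT: store jpeg_to_lepton(body) instead of the original JPEG
    #   on GET: return lepton_to_jpeg(stored) so browsers still see JPEG

That keeps the on-disk objects Lepton-encoded while everything past the storage layer (and every browser) still only ever sees ordinary JPEGs.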

HTH,
filippo