On Mon, Jul 18, 2016 at 5:40 PM, Gergo Tisza <gtisza(a)wikimedia.org> wrote:
On Fri, Jul 15, 2016 at 10:25 AM, Bartosz Dziewoński <bdziewonski(a)wikimedia.org> wrote:
On Fri, 15 Jul 2016 08:35:42 +0200, Pau Giner <pginer(a)wikimedia.org> wrote:
I thought it could be of interest:
Interesting. I'm not sure if our upload backend people are on this list
(mostly Aaron and Filippo take care of it). You might want to forward it to
them.
I wonder if the disk space savings for us would be worth the engineering
time. It probably was for Dropbox (if the 22% reduction saves them
"multiple petabytes of space"), but we "only" have terabytes of JPG images
(most of them on Commons:
https://commons.wikimedia.org/wiki/Special:MediaStatistics), and hard
disks are not that expensive. I'm not really qualified to judge, though :)
How would that work? Most of Dropbox's traffic is probably between their
servers and their (desktop or mobile) clients, so they can just convert
to/from JPEG on both endpoints. For Wikimedia, ~99% of the traffic is sent
to a web browser which does not support Lepton.
We could compress upon writing and decompress upon reading (in Swift, that
is; no Varnish involved). As Bartosz mentioned, though, at our sizes I don't
think it would be worth the engineering time. Commons thumbs+originals is
roughly 160T, so 20% is 32T; in other words we would "save" roughly one
machine's worth of 3TB disks (3x if including replication), while spending
CPU to compress/decompress.
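To make the compress-on-write / decompress-on-read idea concrete, here is a
minimal hypothetical sketch of such a wrapper around an object store. It uses
Python's zlib purely as a stand-in for Lepton (Lepton is a C++ tool with no
Python binding, and would only apply to JPEG payloads); the class and method
names are illustrative, not Swift's actual API:

```python
import zlib

class CompressingStore:
    """Sketch of transparent compression at the storage layer.

    Clients always see the original bytes; only the backend holds
    the compressed form. zlib stands in for Lepton here.
    """

    def __init__(self):
        self._objects = {}  # stand-in for the real storage backend

    def put(self, name, data):
        # Compress upon writing: only the compressed form is stored.
        self._objects[name] = zlib.compress(data)

    def get(self, name):
        # Decompress upon reading: callers never see compressed bytes.
        return zlib.decompress(self._objects[name])

store = CompressingStore()
payload = b"\xff\xd8\xff\xe0" + b"JPEG-ish bytes " * 100
store.put("example.jpg", payload)
assert store.get("example.jpg") == payload  # round-trip is lossless
```

The key property (and the reason Lepton is attractive) is that the round-trip
is bit-exact, so browsers that have never heard of the format are unaffected;
the cost is CPU on every read and write, which is the trade-off weighed above.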
HTH,
filippo