Hi Matt,
On Fri, Oct 18, 2013 at 07:01:40PM -0700, Matthew Walker wrote:
> tldr; Do we have data on the number of compressed vs. uncompressed requests we serve?
Hoping that others chime in on this, as I could not find such data.
Since you suggested on IRC that you could use the output of
and compare sizes to determine whether the response was compressed, I did some random checks:
* Comparing existing logs against the Content-Lengths of gzip-encoded (Content-Encoding: gzip) and unencoded responses, field 7 of [1] typically matches one of those two lengths. So it does not always hold the uncompressed length. I updated [1] accordingly.
* For a few random pages, I checked our existing logs: for <5% of requests, field 7 of [1] matches the uncompressed length, and for >90% it matches the gzipped length. I know this sample is completely unrepresentative, but I hope it helps with feasibility and order-of-magnitude computations.
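In case it helps, here is a rough Python sketch of the comparison I did by hand. It assumes you already have the logged size (field 7 of [1]) for each request to one page, plus that page's uncompressed and gzipped Content-Length. The function names and the tolerance parameter are made up for illustration; nothing like this exists in our tooling as far as I know.

```python
from collections import Counter


def classify(logged_size, uncompressed_len, gzipped_len, tolerance=0):
    """Guess whether a response was compressed by comparing sizes.

    All names here are hypothetical. logged_size is assumed to be
    field 7 of the cache log line; the other two lengths are the
    Content-Lengths seen for the same page without and with gzip.
    """
    matches_gzip = abs(logged_size - gzipped_len) <= tolerance
    matches_plain = abs(logged_size - uncompressed_len) <= tolerance
    if matches_gzip and not matches_plain:
        return "gzip"
    if matches_plain and not matches_gzip:
        return "uncompressed"
    # Neither length matches, or both do (e.g. a tiny page): undecidable.
    return "unknown"


def tally(logged_sizes, uncompressed_len, gzipped_len, tolerance=0):
    """Return the share of each classification for one page's requests."""
    counts = Counter(
        classify(size, uncompressed_len, gzipped_len, tolerance)
        for size in logged_sizes
    )
    total = sum(counts.values())
    return {kind: n / total for kind, n in counts.items()} if total else {}


if __name__ == "__main__":
    # Made-up sizes, purely to show the call; not taken from real logs.
    print(tally([1200, 1200, 5000, 300], uncompressed_len=5000, gzipped_len=1200))
```

Requests that match neither length (errors, redirects, partial responses) end up as "unknown" rather than being forced into one bucket, which seemed the safer choice for a rough estimate.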
Best regards,
Christian
[1] https://wikitech.wikimedia.org/wiki/Cache_log_format