On Thu, May 1, 2014 at 8:37 AM, Fabrice Florin <fflorin@wikimedia.org> wrote:
* how fast do images load? 
I think we now have enough data to make a statement, both on a global and local basis, with breakdowns for warm and cold caches, as well as large images. Want to take a first stab at it, or want me to?

It seems that a lot of data is missing after April 27, but that probably won't affect averages much, so we can answer this one.

For image hits (i.e. when no thumbnail rendering is needed), an average image request takes 0.4 seconds; 10% of users need to wait more than 2 sec, and 1% need to wait longer than 10 sec.
For image misses, the average is 1.5 seconds, and 10% of users need to wait more than 5 sec. The slowest 1% shows much larger variation than for hits, but those users usually have to wait more than 20 sec.
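
For reference, here is a minimal sketch of how numbers like these could be derived from raw timing samples. It is not the query behind the figures above: the (is_miss, seconds) tuple layout and the function names are assumptions made for illustration, and the percentile uses the simple nearest-rank definition.

```python
from statistics import mean


def percentile(sorted_values, p):
    """Nearest-rank percentile for an integer p in 1..100.

    Returns the smallest sample such that at least p% of the
    samples are less than or equal to it.
    """
    if not sorted_values:
        raise ValueError("no samples")
    # Integer ceiling of p * n / 100, which avoids float rounding surprises.
    rank = (p * len(sorted_values) + 99) // 100
    return sorted_values[max(rank, 1) - 1]


def summarize(samples):
    """samples: iterable of (is_miss, load_time_sec) tuples (hypothetical layout)."""
    groups = {"hit": [], "miss": []}
    for is_miss, seconds in samples:
        groups["miss" if is_miss else "hit"].append(seconds)

    summary = {}
    for name, values in groups.items():
        if not values:
            continue  # skip a group with no samples
        values.sort()
        summary[name] = {
            "mean": mean(values),
            "p90": percentile(values, 90),
            "p99": percentile(values, 99),
        }
    return summary
```

"10% of users wait more than X" corresponds to the p90 value here, and "1% wait longer than Y" to p99.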

We do not have a breakdown per image size, although we do have the data required to do it.
 
* does performance improve with more users?
It would be great if we could confirm our hypothesis that images load faster when more users are clicking on them.

The miss ratio has been declining steadily. We don't have a chart showing that (we probably should), but here is a Google Charts chart showing the ratio of misses to all requests in April: it started at 80% and now seems to have stabilized a little above 20%.

Accordingly, the mean image load time is down from over 1 sec to 0.5 sec, and the 90th percentile is down from 4.5 sec to 3 sec. The 99th percentile is around 15 sec (there is no earlier number to compare it with: before enabling on plwiki there wasn't enough data, so the variation was huge).
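
A daily miss ratio like the one described above could be computed roughly as follows. This is only a sketch under assumptions: the (day, is_miss) input layout and the function name are hypothetical and do not reflect our actual log schema.

```python
from collections import defaultdict


def daily_miss_ratio(requests):
    """requests: iterable of (day, is_miss) tuples, e.g. ("2014-04-01", True).

    Returns {day: misses / total requests}, ordered by day.
    """
    totals = defaultdict(int)
    misses = defaultdict(int)
    for day, is_miss in requests:
        totals[day] += 1
        if is_miss:
            misses[day] += 1
    # Only days with at least one request appear in totals,
    # so the division is always safe.
    return {day: misses[day] / totals[day] for day in sorted(totals)}
```

Plotting the returned values over time would give the missing chart.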

* any slowdowns during peak hours? 
Do we have any data that would show what happens during peak hours? Would this require us to create hourly dashboards in a few pilot sites?

We only have daily graphs currently. We do have the data, but we would have to write new queries.
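
As a rough idea of what such a query would need to do, here is a minimal sketch that groups load times by hour of day. The (unix_timestamp, seconds) input layout and the function name are assumptions, and hours are taken in UTC, so "peak hours" for a given site would still need a per-site timezone offset.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def mean_load_time_by_hour(samples):
    """samples: iterable of (unix_timestamp, load_time_sec) tuples (hypothetical layout).

    Returns {hour_of_day_utc: mean load time in seconds}.
    """
    buckets = defaultdict(list)
    for timestamp, seconds in samples:
        hour = datetime.fromtimestamp(timestamp, tz=timezone.utc).hour
        buckets[hour].append(seconds)
    return {hour: mean(values) for hour, values in sorted(buckets.items())}
```

Comparing the busiest hours against the quietest ones would show whether there is any slowdown at peak.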