Commenting on a bunch of things...

On Tue, Mar 18, 2014 at 12:15 PM, Gilles Dubuc <gilles@wikimedia.org> wrote:
I think the only measure of success we can do in our realm is how opening an image in media viewer compares to opening a file page. We're not tracking that yet. The only way I can think of to do that on the user's end is to load a file page in an invisible iframe and measure how long it takes to load, and better yet how long it takes for the image on that file page to load too, and then compare that to an image load in the media viewer. However, that's really challenging to measure, because we can't stop the user from navigating images in the media viewer while we attempt to measure a file page in an iframe, and the navigating they do would trigger requests that use up bandwidth, etc. Thus, I don't think we can get pertinent figures collected directly from users that will tell us whether media viewer is doing a good job in terms of performance, because there would be too much noise in the data collection.

I agree that measuring MediaViewer performance and conventional performance in the same request is not realistic. What we could do is measure MediaViewer performance for those who use it, measure file description page performance for those who don't (or who do but still open description pages from time to time), and compare averages. That's still less than ideal (for one, having a slow or fast connection might be a motivating factor to enable MediaViewer, which would bias the results; moreover, workflows for using the viewer and the description pages might differ in ways that skew the results, e.g. opening dozens of image description pages in background tabs), but relatively easy to do - I think page load time stats are already collected, so it's just a question of crafting the right queries.

- What screen resolution are we testing against? The bigger the resolution, the bigger the image, the slower the image load. I couldn't find any figures about the average desktop screen resolution of people visiting our wikis. Maybe someone knows where to get that figure if we have it? On that front we could either test the performance of the most common resolutions, or test the performance of the average resolution.

Just bumping this in case people in the know do not bother to read back all the quoted messages :) I thought we had screen size stats on stats.wikimedia.org but it seems we do not.

And knowing that there's literally nothing we can do about [varnish cache misses] at this point, besides reducing the amount of buckets to reduce the likelihood of being the first person to hit one.
 
I think there is one more thing we could do if this problem turns out to be really bad, that would be aggressive preloading without actually loading the images. We could fire off requests for all the thumbnails we need, then abort those requests almost immediately. That way the bandwidth waste would be minimal, but the thumbnails would be rendered by the time we request them for real. Not very nice on the servers, but might be worth considering as a last resort.
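A minimal sketch of that preload-then-abort idea, assuming browser XMLHttpRequest; the function name and the injectable request factory are hypothetical conveniences for illustration, not anything MediaViewer actually has:

```javascript
// Warm the thumbnail cache without downloading the images: fire a request
// for each thumbnail URL, then abort it almost immediately. The backend
// still renders the thumbnail into varnish, but the client downloads only
// a few packets. All names here are hypothetical.
function warmThumbnailCache( urls, abortAfterMs, requestFactory ) {
	// requestFactory is injectable for testing; defaults to XMLHttpRequest.
	var makeRequest = requestFactory || function () { return new XMLHttpRequest(); };
	return urls.map( function ( url ) {
		var xhr = makeRequest();
		xhr.open( 'GET', url );
		xhr.send();
		// Abort quickly; the server keeps rendering the thumbnail anyway.
		setTimeout( function () { xhr.abort(); }, abortAfterMs );
		return xhr;
	} );
}
```

Whether the server actually finishes rendering after the client disconnects would need to be verified against the thumbnailing setup.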

On the topic of measuring detailed things from an end-user perspective (time it takes to display the blurred thumb, to hit next, etc.), I think they're too complex to be worth doing at the moment, and we have nothing to compare them against. For example, the amount of time it takes to display the blurred image has no equivalent on the file page, so that figure can't really be used to determine success. Graphs of those figures expressed in user-centric terms would be easier for outsiders to understand, but in terms of troubleshooting technical issues they're not better than the data we're already collecting. They're worse, in fact, because any number of things could happen on the users' computers between action A and action B (browsers freezing tabs comes to mind) that would quickly render a lot of those virtual user-centric figures meaningless. I think we should focus on what makes the core experience better rather than spend time building entertaining graphs.

As far as I can see this would be extremely simple to do: just log the time difference between the thumbnail click and the image load event. I agree these numbers would not be helpful for technical tasks and that we do not have a good baseline to compare them against, but I think even so they would be instructive. (This goes for time to first display only. Other wait times such as prev/next depend on circumstances too much to be of any use; basically we would get random results depending on whether the user spent enough time on the previous image to allow preloading to finish.)
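For what it's worth, a minimal sketch of that logging; the schema name and the logEvent callback are made-up stand-ins for illustration (EventLogging's real API may differ):

```javascript
// Log the time from thumbnail click to the full image's load event.
// 'MediaViewerTiming' and the logEvent signature are hypothetical.
function trackTimeToDisplay( img, clickTime, logEvent ) {
	img.addEventListener( 'load', function () {
		logEvent( 'MediaViewerTiming', {
			action: 'time-to-display',
			milliseconds: Date.now() - clickTime
		} );
	} );
}
```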

Time from click to display of the real image would be a real metric (in the sense that we would actually measure the time passed). We can try to reconstruct it as an artificial metric by adding together various request times, but that is less reliable, and fragile when the architecture changes. We might decide, for example, to join all the API requests we do to fetch the image data into a single one (I doubt it would help, but it is a possibility). The request performance data from before and after that change would be hard to compare, but time-to-display stats would just work, whatever architecture changes happen in the background.

On Mar 18, 2014, at 8:08 AM, Gilles Dubuc <gilles@wikimedia.org> wrote:
I think a more pertinent question than comparing timings would be to measure how many times people open the full resolution image with and without Media Viewer. This is more of a general visit statistic to follow. I imagine that the question you want to answer here is whether people feel less of a need to open the full resolution image, thanks to the extra detail visible on the screen with media viewer. I'm not sure that we can measure that, though, because media viewer and the file page can both access the same full resolution image. I.e. a CDN hit on the full res doesn't mean the person opened the full resolution specifically; it could be that media viewer displayed it because the person's screen resolution required it, or that someone gave them a link to the full resolution image, etc. It's an interesting question, but at a glance measuring it seems quite difficult.

Given that we set CORS headers on our file requests, it should be possible to tell them apart to an extent. (Since we do not do that all the time, we would have to filter some browsers from the requests; sounds painful but doable.)
That would not help with differentiating from other AJAX-based tools; if we want to do that, we could set a custom header (X-Originated-By: MediaViewer) and make sure it gets logged. Might be more work than it is worth...
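The custom-header variant might be as simple as the following sketch; whether varnish logs the header, and whether the CORS configuration allows it, are open questions, and the factory argument exists only for testability:

```javascript
// Tag MediaViewer's own XHR-fetched requests so they can be told apart in
// the request logs. The header name comes from the thread; everything else
// here is illustrative.
function fetchWithMarker( url, makeXhr ) {
	var xhr = ( makeXhr || function () { return new XMLHttpRequest(); } )();
	xhr.open( 'GET', url );
	// Custom headers only work for XHR requests, not plain <img> loads,
	// and the server must whitelist the header for cross-origin use.
	xhr.setRequestHeader( 'X-Originated-By', 'MediaViewer' );
	xhr.send();
	return xhr;
}
```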

Another possibility would be to measure if file page usage drops, although this is complicated by the fact that MediaViewer uses the file page in share links.

On Tue, Mar 18, 2014 at 10:28 AM, Gilles Dubuc <gilles@wikimedia.org> wrote:
Aren't all pilot sites hosted in the same place? I think tracking them on the detailed network performance level, as we already do, is sufficient to spot issues in the ops realm where one site would get unusually slow compared to the others. We can check, to be certain, whether Media Viewer vs. file page gives us different results between the sites, but if they're all hosted in the same place, they should give us the same results. If I'm wrong about the hosting location, then yes, we should definitely track them.

There are some subtle differences: they are in different database clusters (assuming we go with the currently planned set of pilots - plwiki is in s2, cawiki, huwiki and kowiki in s7, enwikivoyage in s3 [1]) so load statistics might be different (this does not affect image loading time, but might affect the APIs). Also, there are varnish clusters in Europe and Asia, and a file page can come entirely from Varnish, but MediaViewer uses API calls which are never cached (I think?) so it might be at a larger disadvantage on e.g. the Korean Wikipedia.

More importantly, the two main bottlenecks we have are probably image loading time and image rendering time, and the proportion of those depends on geography, so it might be very different for the Korean Wikipedia and the European ones. (Not sure in which direction though, since both affect MediaViewer more than the file description page: rendering because the description page is practically never a varnish miss, and bandwidth because we show larger images.)

And if time allows:
- add the thumb dimensions to the markup in core, so that we can display the blurred thumbnail 200-300ms sooner on average (helps with perception)

One thing we could do in addition to that is to replicate the thumbnail URL generation logic on the JS side. This would allow us to request the thumbnail as soon as the user clicks, without having to make a thumbnailinfo call first. Probably not much of a difference, but given that the thumbnail comes from varnish while the API response comes from the PHP servers, it might be significant in some countries.