> On the topic of measuring detailed things from an end-user perspective
> (time it takes to display the blurred thumb, to hit next, etc.) I think
> that they're too complex to be worth doing at the moment, and we have
> nothing to compare them against.
I agree that "perceived performance", as the name suggests, is subjective,
but having a measure that helps us get an idea of whether efforts on that
front are working is useful. The time the user waits until seeing some
kind of progress (e.g. the blurry image appearing) is useful in that
context.
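Something like this minimal sketch could capture it (mw.track() is
MediaWiki's real analytics hook, but the topic name and hook points here
are made up):

    // Record the delay between the thumbnail click and the first visible
    // progress (the blurred placeholder appearing), then report it.
    declare const mw: { track(topic: string, data?: unknown): void };

    let clickedAt = 0;

    function onThumbnailClick(): void {
      clickedAt = performance.now();
    }

    function onBlurredThumbShown(): void {
      mw.track('mmv.perceived.firstProgress', {
        ms: performance.now() - clickedAt,
      });
    }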
> For example, the amount of time it takes to display the blurred image
> has no equivalent on the file page, so that figure can't really be used
> to determine success.
For the file page, it is true that the perceived performance will
approximate the real one.
Do we also include access to the original file (which some users open to
view the image in more detail) as part of what we compare with Media
Viewer? If that is the case, big images that are progressively loaded by
browsers would be more comparable, and it would be interesting to check
both times (showing something vs. showing the complete image) in Media
Viewer and outside of it.
Another interesting time to take into account is the time saved through
navigation controls. Having an idea of how much that feature is used could
give us an estimate of the time saved by the user (who currently has to go
back and forth, dealing with additional page loads or tab switching in the
browser).
Having said all that, I totally understand that measuring the real
performance is considered a higher priority than the perceived
performance, but measures to estimate the latter should not be overlooked.
Pau
On Tue, Mar 18, 2014 at 12:15 PM, Gilles Dubuc <gilles(a)wikimedia.org> wrote:
> Goal: The Media Viewer would be considered accepted if it can display
> 1-2 MB images in less than 3 seconds at least 80% of the time, during
> the course of a week.
The issue with that goal is that its performance is almost entirely out
of our hands. As seen in my preliminary analysis of the data (
https://www.mediawiki.org/wiki/Multimedia/Performance_Analysis ), some
places like Russia seem to have terrible performance relative to the
average internet speed in those countries, and there's nothing we (the
multimedia team) can do to change that. Chances are, if Media Viewer can't
display a 1-2 MB image in less than 3 seconds 80% of the time, neither can
the file page or any wiki page with an image of the same size on it,
because the issue is most likely the connectivity between users in those
countries and our servers, not the technique used to deliver the image (as
part of the page load or loaded by JS).
I think that the main issue with the goal options I've seen so far is
that they focus on the general performance of Media Viewer as an isolated
entity. The network performance tracking we've set up is good for
identifying issues on our end: for example, an API call that might be too
slow and that we could perhaps optimize, or the fact that we could patch
MediaWiki core to add thumb dimensions and display the thumb sooner. It
also helps us keep track, on an ongoing basis, of any ops issues that
might be affecting the product we're responsible for (Media Viewer). These
stats help make sure that we're doing the most we can to make things fast.
What these network performance stats aren't good for, though, is
determining whether Media Viewer is successful as a product, because the
performance of our servers, our CDNs and our networking infrastructure is
all bundled up in the same figure, indistinguishable from one another.
They don't tell us whether Media Viewer is good in the context of an
infrastructure that won't change overnight.
I think the only measure of success we can make in our realm is how
opening an image in Media Viewer compares to opening a file page. We're
not tracking that yet. The only way I can think of to do that on the
user's end is to load a file page in an invisible iframe and measure how
long it takes to load (and better yet, how long it takes for the image on
that file page to load too), then compare that to an image load in Media
Viewer. However, it's really challenging to measure, because we can't stop
the user from navigating images in Media Viewer while we attempt to
measure a file page in an iframe, and the navigating they do would trigger
requests that use up bandwidth, etc. Thus, I don't think we can get
pertinent figures collected directly from users that will tell us whether
Media Viewer is doing a good job in terms of performance, because there
would be too much noise in the data collection.
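To make the idea concrete, a minimal sketch of the hidden-iframe
measurement (illustrative only; a real version would also need same-origin
access to time the image element inside the frame separately, and would
still suffer from the noise described above):

    // Load a file page in an invisible iframe and time it. The 'load'
    // event fires once the page and its subresources, including the
    // full-size thumb, have arrived.
    function measureFilePageLoad(filePageUrl: string): Promise<number> {
      return new Promise((resolve) => {
        const start = performance.now();
        const iframe = document.createElement('iframe');
        iframe.style.display = 'none';
        iframe.src = filePageUrl;
        iframe.addEventListener('load', () => {
          resolve(performance.now() - start);
          iframe.remove();
        });
        document.body.appendChild(iframe);
      });
    }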
I think that automated testing is the way to go: we should package this
performance measurement (Media Viewer vs. file page) as a series of
browser tests and check the figures that way. Even better if they can run
on something like CloudBees, where there would be some latency between
where the tests run and our servers. Now, there are variables at play when
making a Media Viewer/file page comparison:
- Is the JS already cached? As Gergo mentioned, the JS being uncached
will happen the first time and then every 30 days or so, or whenever we
update Media Viewer (at most once a week, usually). I think we should
measure both variants (with the JS cached and with it uncached), to assess
how bad the effect of a cold cache is. There are a number of ways we could
address this issue, some more aggressive than others in terms of bandwidth
(e.g. preload the JS when the mouse cursor gets near a thumbnail, preload
the JS after the page load is done, etc.). This is worth measuring because
it's actionable. The reason we haven't taken those measures yet is that
they're a balancing act (wasting people's bandwidth vs. providing a faster
experience).
- What screen resolution are we testing against? The bigger the
resolution, the bigger the image, and the slower the image load. I
couldn't find any figures about the average desktop screen resolution of
people visiting our wikis. Maybe someone knows where to get that figure if
we have it? On that front we could either test the performance of the most
common resolutions, or test the performance of the average resolution.
- Varnish cache hit vs. varnish cache miss. We know that's a big slowdown
when it happens, and we know that it won't get solved for another few
months. That variable, however, also applies to file pages: the image on
the file page is a thumb too, and it can be a varnish miss as well. We
don't see it often because it stops as soon as one person (usually the
author) visits the file page. Media Viewer just increases the probability
of hitting a varnish cache miss because we have a few buckets instead of
the single size/bucket of the file page. I think this is an isolated
problem, and actually one that needs more serious math to measure the
effect of (see the sketch after this list). Why more serious math?
Because, for one, it depends on the distribution of desktop resolutions
among our visitors compared to the buckets we've picked. If, for example,
a given bucket size covers 80% of our visitors, then in 80% of the cases
the effect of varnish misses is exactly the same as on the file page. We
also have to consider whether it's worth spending time studying this issue
at all, knowing that a few months from now ops will have the disk capacity
that will allow us to pregenerate the bucket sizes we need, and knowing
that there's literally nothing we can do about it at this point, besides
reducing the number of buckets to reduce the likelihood of being the first
person to hit one. My recommendation for that issue is that we use the
technical performance data we're already collecting to determine what
percentage of image views are affected by it over time on wikis that have
signed up for the launch. Then we'll get an idea of how bad it really is
on a wiki where everyone has Media Viewer (because, by network effect, the
more people there are, the less likely you are to be the first person to
use Media Viewer on a given file). But it's not worth obsessing over right
now, because the low traffic of the test sites makes it happen to us a
whole lot more than it would in a context where every visitor has Media
Viewer.
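To make the bucket argument concrete, here is a back-of-the-envelope
sketch, under one reading of the point above. The bucket widths and
visitor shares are made-up numbers, not our real distribution: the share
of image views exposed to extra varnish misses is the share of visitors
whose bucket is not the one that regular traffic already keeps warm.

    // Hypothetical visitor share per thumbnail bucket width.
    const bucketShare: Record<number, number> = {
      1024: 0.8,
      1280: 0.15,
      1920: 0.05,
    };
    // Assume the dominant bucket stays warm, like the file page thumb.
    const warmBucket = 1024;

    const extraMissExposure = Object.entries(bucketShare)
      .filter(([width]) => Number(width) !== warmBucket)
      .reduce((sum, [, share]) => sum + share, 0);

    console.log(extraMissExposure); // ~0.2: only 20% of views see added risk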
So, once we've settled on how to handle the above variables, we can come
up with acceptance criteria for Media Viewer's performance, which could
look like:
- with a cold JS cache, at an average desktop resolution, with a varnish
hit, Media Viewer shows the image in at most 100% of the time it takes for
the file page to do the same
- with a warm JS cache, at an average desktop resolution, with a varnish
hit, Media Viewer shows the image in at most 75% of the time it takes for
the file page to do the same
- with a warm JS cache, at a large desktop resolution, with a varnish
hit, Media Viewer shows the image in at most 120% of the time it takes for
the file page to do the same
An added advantage of making this measurement automated is that it can
be baked in as a test success/failure criterion. So if we suddenly make a
code change that mistakenly makes the experience slower than our criteria
allow, the team would be notified automatically.
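A sketch of how those criteria could be expressed as an automated check
(the scenario table mirrors the list above; the two measurement functions
are placeholders for whatever the browser tests end up using):

    interface Scenario {
      jsCache: 'cold' | 'warm';
      resolution: 'average' | 'large';
      maxRatio: number; // allowed Media Viewer time / file page time
    }

    const scenarios: Scenario[] = [
      { jsCache: 'cold', resolution: 'average', maxRatio: 1.0 },
      { jsCache: 'warm', resolution: 'average', maxRatio: 0.75 },
      { jsCache: 'warm', resolution: 'large', maxRatio: 1.2 },
    ];

    async function checkAcceptance(
      measureMediaViewer: (s: Scenario) => Promise<number>,
      measureFilePage: (s: Scenario) => Promise<number>,
    ): Promise<void> {
      for (const s of scenarios) {
        const ratio =
          (await measureMediaViewer(s)) / (await measureFilePage(s));
        if (ratio > s.maxRatio) {
          // A CI runner would surface this as a test failure notification.
          throw new Error(
            `Too slow: ${ratio.toFixed(2)}x vs allowed ${s.maxRatio}x ` +
            `(${s.jsCache} JS cache, ${s.resolution} resolution)`,
          );
        }
      }
    }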
On the topic of measuring detailed things from an end-user perspective
(time it takes to display the blurred thumb, to hit next, etc.), I think
that they're too complex to be worth doing at the moment, and we have
nothing to compare them against. For example, the amount of time it takes
to display the blurred image has no equivalent on the file page, so that
figure can't really be used to determine success. Graphs of those figures
expressed in user-centric terms would be easier for outsiders to
understand, but in terms of troubleshooting technical issues they're no
better than the data we're already collecting. They're worse, in fact,
because any number of things could happen on the user's computer between
action A and action B (browsers freezing tabs comes to mind) that would
quickly render a lot of those virtual user-centric figures meaningless. I
think we should focus on what makes the core experience better, not spend
time building entertaining graphs.
On Tue, Mar 18, 2014 at 1:00 AM, Fabrice Florin <fflorin(a)wikimedia.org> wrote:
> Hi Multimedia team (keeping it to a short list so we can reach closure
> soon on this important topic):
>
> Did you have any comments on my email of Friday on the Image Load
> Study? (see below) That proposal was based on last week's conversations
> with you guys.
>
> If this general direction works for you, I propose the following main
> acceptance criteria from a performance standpoint:
>
> Goal: The Media Viewer would be considered accepted if it can display
> 1-2 MB images in less than 3 seconds at least 80% of the time, during
> the course of a week.
> Verification: This goal could be verified with a histogram showing
> total load events in a week for 1-2 MB images, with these deciles: number
> of image load events in under 1 second? in 1-2 seconds? in 2-3 seconds? ...
> and so on, up to 10 seconds or more. If 80% of these events take place in
> the first three deciles, we would have reached our goal.
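A minimal sketch of that check, assuming a week's load times for 1-2 MB
images are available in seconds (the function name is illustrative):

    // Bin load times into one-second buckets (0-1s ... 9-10s, 10s and up),
    // then test whether at least 80% of events fall in the first three.
    function passesGoal(loadTimesSec: number[]): boolean {
      const buckets = new Array(11).fill(0);
      for (const t of loadTimesSec) {
        buckets[Math.min(Math.floor(t), 10)] += 1;
      }
      const underThreeSec = buckets[0] + buckets[1] + buckets[2];
      return underThreeSec / loadTimesSec.length >= 0.8;
    }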
>
> Would this seem like a reasonable basic measure of success for us in
> coming weeks? Or would you recommend another goal?
>
> If we had more time, we could track a variety of other goals, but I am
> looking for a single metric we can focus on and actually measure in time
> for launch. If we want more granular criteria, I proposed other possible
> performance targets by image size in card #149.
>
> On the assumption that this is a good direction to pursue, I propose we
> focus on the following 4 high priority cards for our next steps:
>
> #149 Define acceptance performance criteria for the media viewer (see
> above, let's edit as needed to reflect our team goal)
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/149
>
> #364 Instrumentation for timing of image load, lightbox UI load
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/364
>
> #292 Histograms and decile charts for performance
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/292
>
> #198 Analyze Image Load Data with Dashboards
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/198/
>
> I also created this Metrics Tasks Wall, based on Gergo's Epic
> Story:#359, to make it easier to track all these tickets:
>
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards?favorit…
>
> Given our primary goal proposed above, I would recommend that we
> prioritize #364 and #292 over #198 -- and postpone the bandwidth-related
> tickets, as recommended in the P.S. below.
>
> Please let me know what you think and what you recommend for our next
> steps.
>
> Thanks,
>
>
> Fabrice
>
>
> P.S.: For now, I recommend that we de-emphasize these bandwidth-related
> metrics, since they are unlikely to happen in our time-frame:
>
> #361 Collect bandwidth stats
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/361
>
> #340 More Image Load Dashboards by Bandwidth
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards/340
>
>
> On Mar 14, 2014, at 4:13 PM, Fabrice Florin <fflorin(a)wikimedia.org>
> wrote:
>
> Hi everyone,
>
> We would appreciate your advice on our upcoming research study of image
> load times on Media Viewer.
>
> Here are proposed goals, questions and outcomes for this study. They
> are presented for discussion purposes, not as a prescriptive requirement -
> and will be adjusted based on your feedback.
>
>
> *I. Goals*
> The goal of this study is to determine whether or not Media Viewer is
> loading images fast enough for the majority of our users in most common
> situations.
>
> As a typical user of the Media Viewer, I want images to load quickly,
> in just a few seconds, so I don't have to wait a long time to see them.
>
> Here are our recommended performance targets for image load times by
> connection speed, to match user expectations on the Web:
> * 1-2 seconds for a medium-size image on a fast connection
> * 2-3 seconds for the same image on a medium connection
> * 5-8 seconds for the same image on a slow connection
>
> If tracking connection speeds is too hard in our time-frame, we could
> base our performance targets on image size instead. For example:
> * 1-2 seconds for a small-size image on a medium connection
> * 2-3 seconds for a medium-size image on the same connection
> * 5-8 seconds for a large-size image on the same connection
>
> Definitions:
> * Image load time = the number of seconds from when you click on a
> thumbnail to when you see the full image
> * Image size: large = over 2 MB, medium = 1 to 2 MB, small = under 1 MB
> * Connection speed: fast = over 256 Kbps, medium = 64 to 256 Kbps, slow =
> under 64 Kbps
>
> The above numbers are for discussion purposes, and can be adjusted
> based on your feedback.
>
>
> *II. Questions*
> Here are the main research questions we propose to answer about image
> load performance.
>
> *1. How long does it take for an image to load for the conditions
> below?*
> (image load = total time from thumbnail click to full image display)
>
> a. by image size:
> load times for large images? medium images? small images?
>
> b. by web site:
> load times for mediawiki.org? commons? enwiki? frwiki? huwiki?
> other sites?
>
> c. by connection speed: (optional)
> load times for fast connections? medium connections?
> slow connections? (this may not be feasible in our time frame)
>
> d. by daypart: (optional)
> load times for morning? afternoon? evening? night time? (to show if
> performance slows during peak hours)
>
> This question could be answered by storing the timestamp for thumbnail
> clicks, as well as the timestamp for the full image display, then logging
> the difference.
>
> We would then prepare different bar graphs for each condition set
> above, with categories on the vertical axis, and number of seconds on the
> horizontal axis. The graphs could be based on data from the last 7 days.
>
>
> *2. How often does the image load time exceed our performance targets
> above?*
>
> a. by load time in a day:
> number of images that load in under 1 second? in 1-2 seconds? in
> 2-3 seconds? ... and so on, up to 10 seconds or more
>
> b. by load time in a week:
> number of images that load in under 1 second? in 1-2 seconds? in
> 2-3 seconds? ... and so on, up to 10 seconds or more
>
> This question could be answered by preparing different histograms,
> with number of images on the vertical axis, and number of seconds on the
> horizontal axis (deciles).
>
>
> *III. Outcomes*
> To answer these questions, we plan to collect data during our upcoming
> pilots on different sites in April.
>
> Based on these pilot results, we will need to make decisions about the
> wider deployments planned for May.
>
> Here are possible outcomes from this study:
>
> Outcome 1: Favorable - e.g.: 80% of images load quickly
> Action: Go ahead with current release plan to deploy Media Viewer
> everywhere by default.
>
> Outcome 2: Neutral - e.g.: 50% of images load quickly
> Action: Go ahead with current release plan, but deploy Media Viewer as
> an opt-in feature on wikis that don't want it by default.
>
> Outcome 3: Unfavorable - e.g.: 20% of images load quickly
> Action: Revisit release plan: consider making this opt-in everywhere --
> or work on faster image load solutions.
>
>
> We would be grateful for your comments on this, so we can refine our
> plans before we start this study next week. Please let us know which
> metrics above seem most important, given that we only have a few developer
> days to collect and analyze a few key metrics in coming weeks, to determine
> if we are meeting our objectives. Some related links are included below,
> for your convenience.
>
> To end on a positive note, we deployed a new version of Media Viewer
> yesterday that is much faster, thanks to all the fine work from our
> development team. This morning, I looked at a variety of 'non-popular'
> images on enwiki, and the Media Viewer experience was quite good
> overall. Most images loaded within the 2-second maximum which we
> recommend for a 'fast' connection -- and this was on a home wifi
> connection. I realize this is completely anecdotal, and not supported by
> hard data, so we can't make any decisions based on it. But it makes me
> hopeful that we are getting close to our objectives. Even compared to
> large commercial sites like Flickr, we hold up pretty well on this
> computer. :)
>
> Thanks for your interest in this project.
>
> All the best,
>
>
> Fabrice
>
>
> _______________________________
>
>
> *USEFUL LINKS*
>
> * Media Viewer Release Plan:
>
https://www.mediawiki.org/wiki/Multimedia/Media_Viewer/Release_Plan
>
> * First Media Viewer Metrics:
>
http://multimedia-metrics.wmflabs.org/dashboards/mmv
>
> * Media Viewer Test Page:
>
https://commons.wikimedia.org/wiki/Commons:Lightbox_demo
>
> * Metrics Tasks under consideration (Mingle):
>
>
https://wikimedia.mingle.thoughtworks.com/projects/multimedia/cards?favorit…
>
> * Next Development Cycle (Mingle):
>
http://ur1.ca/gtvvr
>
> * About Media Viewer:
>
https://www.mediawiki.org/wiki/Multimedia/About_Media_Viewer
>
>
> _______________________________
>
> Fabrice Florin
> Product Manager, Multimedia
> Wikimedia Foundation
>
>
http://en.wikipedia.org/wiki/User:Fabrice_Florin_(WMF)
>
>
>
>
>
> _______________________________
>
> Fabrice Florin
> Product Manager
> Wikimedia Foundation
>
>
http://en.wikipedia.org/wiki/User:Fabrice_Florin_(WMF)
>
>
>
>