[gerco] We also want to display global average loading time, which is an average of all the logged loading times (which, per above, use different sampling).
[gilles] Having every graph and metric possible isn't necessarily a useful goal. Specific graphs are only worth having if they provide actionable conclusions that can't be found by looking at other graphs.
Agreed. I was about to send Gerco a response along the same lines. I think a graph of "global average loading time" is not very useful. The main point of graphing for performance is to "check" the health of the system and provide "actionable" data. A global metric like the one you are describing provides neither in this case. It would be a poor measure of the overall health of the system, since it does not closely represent the user experience of interacting with the system for users with either a warm or a cold cache. And it does not provide clear actionable data, as it is too much of a "bird's-eye view" of the system. You would need to drill into the per-wiki percentile data to find actionable items.
[gerco] We might even want to display per-country loading times,
[gilles] There's plenty of useful data on metrics with decent sample sizes, I think that trying to increase the sample size of each small metric for each small country is a little futile.
Also agreed here. If there is a true use case for which we need this information we can work on it, but let's not drown ourselves in data; the initial per-wiki percentile graphs are likely to provide many actionable points.
On Wed, May 21, 2014 at 10:13 AM, Gilles Dubuc gilles@wikimedia.org wrote:
The duration log shows
I think you're focusing too much on the duration log which isn't graphed yet. Implementing graphs for that data has been constantly postponed in our cycle planning because it's been considered lower priority than the rest. We can focus on challenges specific to that data whenever it gets picked up.
We also want to display global average loading time, which is an average of all the logged loading times (which, per above, use different sampling). We might even want to display per-country loading times, which is an even more random mix of data from different wikis.
Having every graph and metric possible isn't necessarily a useful goal. Specific graphs are only worth having if they provide actionable conclusions that can't be found by looking at other graphs. For example, not being able to generate global graphs isn't that big a deal if we can draw the same conclusions they would provide by looking at the graphs of very large wikis. An entertaining graph isn't necessarily useful.
At this point the action log is the only one likely to have mixed sampling, but we only use that one for totals, not averages/percentiles. The only metrics we're displaying averages and percentiles for have consistent sampling across all wikis. Even for the duration log, there is consistent sampling at the moment, and it's so similar to the other sampled metrics we currently have that I don't foresee the need to introduce mixed sampling.
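To make the totals-vs-averages distinction concrete, here is a minimal Python sketch of estimating a total from a log with mixed sampling: each logged event simply stands in for as many real events as its sampling factor. The field names here are hypothetical, not the actual schema.

    # Each sampled event stands in for `sampling_factor` real events, so a
    # 1:1000-sampled row contributes 1000 to the estimated total.
    def estimated_total(events):
        return sum(e["sampling_factor"] for e in events)

    events = [
        {"wiki": "enwiki", "sampling_factor": 1000},  # sampled 1:1000
        {"wiki": "cawiki", "sampling_factor": 1},     # unsampled
        {"wiki": "cawiki", "sampling_factor": 1},
    ]
    print(estimated_total(events))  # -> 1002 estimated real actions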
As for adapting the consistent sampling we currently have on our sampled logs to improve the accuracy of metrics for small countries or small wikis where the sample size is too small: is it really useful? Are we likely to find that increasing the accuracy of the measurement of a specific metric in a given African country will tell us something we don't already know? There's plenty of useful data on metrics with decent sample sizes; I think that trying to increase the sample size of every small metric for every small country is a little futile.
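For a rough sense of why this is a losing battle: the relative standard error of a mean shrinks only as 1/sqrt(n), so halving the error bar takes four times the data. A small illustrative sketch (the coefficient of variation and the sample sizes are hypothetical numbers, not measured ones):

    # Illustrative only: relative standard error of a mean vs. sample size.
    from math import sqrt

    cv = 1.0  # coefficient of variation (stddev/mean); load times vary a lot

    for n in (25, 100, 400, 1600):
        rse = cv / sqrt(n)
        print(f"n={n}: relative standard error ~ {rse:.1%}")
    # n=25: ~20.0%, n=100: ~10.0%, n=400: ~5.0%, n=1600: ~2.5%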
On Tue, May 20, 2014 at 8:38 PM, Gergo Tisza gtisza@wikimedia.org wrote:
On Tue, May 20, 2014 at 6:18 AM, Nuria Ruiz nuria@wikimedia.org wrote:
[gerco] - whenever we display geometric means, we weight by sampling rate (exp(sum(sampling_rate * ln(value)) / sum(sampling_rate)) instead of exp(avg(ln(value))))
[gilles] I don't follow the logic here. Like percentiles, averages should be unaffected by sampling, geometric or not.
[gerco] Assume we have 10 duration logs with 1 sec time and 10 with 2 sec; the (arithmetic) mean is 1.5 sec. If the second group is sampled 1:10, and we take the average of that, that would give 1.1 sec; our one sample from the second group really represents 10 events, but only has the weight of one. The same logic should hold for geometric means.
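A minimal Python sketch reproducing those numbers, where sampling_factor is the inverse of the sampling fraction (10 for 1:10 sampling); weighting each logged value by its factor recovers the true mean, and the same weighting applies to the geometric mean:

    # 10 events at 1 s and 10 at 2 s; the second group is sampled 1:10,
    # so only one of its events is actually logged.
    from math import exp, log

    logged = [(1.0, 1)] * 10 + [(2.0, 10)]  # (value_sec, sampling_factor)

    naive_mean = sum(v for v, _ in logged) / len(logged)
    weighted_mean = sum(v * w for v, w in logged) / sum(w for _, w in logged)
    weighted_geo_mean = exp(sum(w * log(v) for v, w in logged)
                            / sum(w for _, w in logged))

    print(round(naive_mean, 2))         # 1.09 -- the biased ~1.1 s
    print(round(weighted_mean, 2))      # 1.5  -- the true arithmetic mean
    print(round(weighted_geo_mean, 2))  # 1.41 -- weighted geometric mean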
What variable are we measuring with this data that we are averaging?
The duration log shows the total time it takes for the viewer to load itself and the image data (milliseconds between clicking on the thumbnail and displaying the image). We want to sample this on large wikis since it generates a lot of data. We want to not sample this on small wikis since they generate very little data and the sampling would make it unreliable.
We want to display average loading time for each wiki as decisions to enable/disable by default on that wiki should be informed by that stat (some wikis can have very different loading times due to network geographics). We also want to display global average loading time, which is an average of all the logged loading times (which, per above, use different sampling). We might even want to display per-country loading times, which is an even more random mix of data from different wikis.
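For illustration, a minimal sketch of the kind of per-wiki sampling decision described above; the factor table and function name are hypothetical, not the actual implementation:

    import random

    # Hypothetical per-wiki sampling factors: large wikis sampled, small not.
    SAMPLING_FACTORS = {"enwiki": 1000, "commonswiki": 100}

    def should_log_duration(wiki):
        factor = SAMPLING_FACTORS.get(wiki, 1)  # unlisted wikis unsampled
        return random.random() < 1.0 / factor

    # Each logged event then represents `factor` real interactions, which
    # is why mixing factors requires the weighting shown earlier.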