Gilles:
This event has a fairly constant rate of input:
http://graphite.wikimedia.org/render/?width=588&height=311&_salt=14…
As far as I can see, it has not changed significantly before or after the
7th. Does this throughput rate match what you see in the database? If so, I
would say the db matches the incoming stream. If it doesn't (somehow the db
has less data), we might have a problem. However, events dropped while
inserting into the db would affect all countries equally, so I doubt they
are the source of the discrepancy.
Is there a larger number of events that might not be validating for other
wikis than for China?
Logs are at stat1002:/a/eventlogging/archive, and the all-events log will
have every single one of your events received. Please note that events are
not inserted right away when received; there is a buffer of a couple of
minutes.
> This balance shifting over time is really problematic for tracking Media
> Viewer client-side network performance, because Chinese traffic suddenly
> accounting for a bigger or smaller share of the overall recorded events
> creates big swings in the global averages/percentiles (since network
> performance in China is bad).
Wouldn't this happen if your product is "more used" in China than
elsewhere? If you are counting "absolute" values, you will always be skewed
toward the biggest dataset unless you are doing some kind of scaling.
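As a rough illustration of the kind of scaling meant here, the sketch below compares a plain global mean with a mean of per-country means weighted by fixed reference shares. Everything in it is an assumption made for illustration: the (country, duration) event shape, the function names, and the reference shares are hypothetical and not taken from the MultimediaViewerNetworkPerformance schema.

```python
from collections import defaultdict


def global_mean(events):
    """Plain mean over all events; moves whenever a country's share moves."""
    return sum(duration for _, duration in events) / len(events)


def reweighted_mean(events, reference_shares):
    """Mean of per-country means, weighted by fixed reference shares.

    events: list of (country, duration_ms) tuples (hypothetical shape).
    reference_shares: dict mapping country -> weight, e.g. the shares
    observed during a period that is trusted (an assumption).
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for country, duration in events:
        totals[country] += duration
        counts[country] += 1

    # Use only countries present in both the data and the reference shares,
    # then renormalise the weights so they sum to 1.
    common = [c for c in counts if c in reference_shares]
    weight_sum = sum(reference_shares[c] for c in common)
    return sum(
        reference_shares[c] * totals[c] / counts[c] for c in common
    ) / weight_sum


# Made-up numbers: per-country performance is identical in both periods,
# only the share of recorded events per country changes.
before = [("US", 100.0)] * 90 + [("CN", 1000.0)] * 10
after = [("US", 100.0)] * 70 + [("CN", 1000.0)] * 30
shares = {"US": 0.9, "CN": 0.1}

print(global_mean(before), global_mean(after))
print(reweighted_mean(before, shares), reweighted_mean(after, shares))
```

With these made-up numbers the plain mean goes from 190 ms to 370 ms although nothing changed within either country, while the reweighted mean stays at about 190 ms. The same idea carries over to percentiles only approximately, since percentiles cannot be combined by a simple weighted sum.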
Thanks,
Nuria
On Wed, Jan 28, 2015 at 4:09 PM, Gilles Dubuc <gilles(a)wikimedia.org> wrote:
Hi all,
I've tracked down an unexplained EL phenomenon that surfaced as a false
trend in our global stats.
The data I'm looking at specifically is coming from Media Viewer's
MultimediaViewerNetworkPerformance_* tables.
Have a look at this graph:
https://docs.google.com/spreadsheets/d/1PJsyzAyj74dctGCl4-09L7LS4AMZRh57G56…
The big change is on Jan 7th/8th.
It shows how many EL events we've recorded, per client-reported country,
over the last 90 days. The sampling factor we use has been constant for
each wiki over that period. Thus, the distribution shouldn't evolve
drastically, aside from seasonal/local trends. Besides the Ukraine spike on
a particular date (probably related to world events), the graph before Jan
7th looks like what you would expect. Then, following the outage that
happened on Jan 7th, not only has the balance completely changed, but it
also keeps evolving over time (the US and China are keeping "higher than
normal" levels, while the rest seem to slide below their pre-7th
quantities), which tells me that something strange is happening and is
probably unresolved.
This balance shifting over time is really problematic for tracking Media
Viewer client-side network performance, because Chinese traffic suddenly
accounting for a bigger or smaller share of the overall recorded events
creates big swings in the global averages/percentiles (since network
performance in China is bad).
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics