Well, I'm no longer our resident anything expert, merely /a/ anything expert :).
The "concoction", as you put it, comes from the webrequest_all_sites
data that is consumed by stats.wikimedia.org's primary report - I
can't speak for how the dashboard you're linking to is constructed.
Perhaps you could? I doubt this is a "concoction" problem given that,
as you will note if you've studied the visualisations, both the UDF
and the hive query implementation (which were written by two different
people, and code reviewed by two /more/ people) agree that this
dramatic, unexplained and untracked drop happened. And, since we've
been using the hive query implementation for all our high-level
numbers for about six months, a bug of this magnitude in the
/implementation/ of the definition would be... worrying.
Indeed, your report says 20B per month (again, is it drawing from the
same data source as the aggregate, high-level number?) - I never
claimed 1.1B a day; you did. Instead, the daily figure started off at
approximately 1.1-1.2B, before dropping to between 600M and 700M,
where it has stayed ever since. Averaged, that sounds like
approximately 0.75B, no? That is the disadvantage of comparing a
single monthly number against a more granular dataset.
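To make the averaging point concrete, here is a minimal sketch (not
taken from the UDF or the hive query) of how one per-period average
blends the two levels. The function name, the midpoints of 1.15B and
0.65B, and the 28-day period with its splits are all assumptions for
illustration; the real figure depends on the actual daily counts.

```python
def blended_daily_average(rate_before, rate_after, days_before, days_after):
    """Mean daily pageviews over a period that contains one step change."""
    total_days = days_before + days_after
    if total_days <= 0:
        raise ValueError("the period must contain at least one day")
    total_views = rate_before * days_before + rate_after * days_after
    return total_views / total_days


# Assumed midpoints of the ranges quoted above, in billions per day.
RATE_BEFORE = 1.15
RATE_AFTER = 0.65

# Hypothetical splits of a 28-day period: how many days precede the drop.
for days_before in (0, 7, 14, 28):
    avg = blended_daily_average(
        RATE_BEFORE, RATE_AFTER, days_before, 28 - days_before
    )
    print(f"{days_before:2d} days before the drop: {avg:.2f}B per day")
```

The single average can land anywhere between the two levels depending
on how many days fall on each side of the drop, so a monthly total
cannot show the drop itself.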
On 12 March 2015 at 17:55, Erik Zachte <ezachte(a)wikimedia.org> wrote:
I'd rather see you explain this, Oliver, as our
incumbent page views expert.
Your concoction of legacy PV seems to suggest 'Old definition, UDF' was about
1.1B per day.
Yet
http://stats.wikimedia.org/EN/TablesPageViewsMonthlyAllProjects.htm shows 20B per
month, 0.75B per day
Erik
-----Original Message-----
From: analytics-bounces(a)lists.wikimedia.org
[mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of Oliver Keyes
Sent: Thursday, March 12, 2015 19:38
To: A mailing list for the Analytics Team at WMF and everybody who has an interest in
Wikipedia and analytics.
Subject: [Analytics] [Technical] final pageviews QA
Hey all,
After the patches to the definition following the previous hand-coding run (see older
threads) I've run a second set of tests. These can be seen at
https://commons.wikimedia.org/wiki/File:Pageviews_QA_2.png and
https://commons.wikimedia.org/wiki/File:Pageviews_QA_jittered_2.png
There's nothing particularly shocking in the new definition; it follows the seasonal
pattern that we're used to. I think we can call the new definition done, with these
tweaks! It's also not as unstable as the legacy definition (good luck to whoever now
has the responsibility of explaining why pageviews abruptly halved in the middle of
February).
Have fun,
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
--
Oliver Keyes
Research Analyst
Wikimedia Foundation