Can we compare the monthly totals?
On Thu, Mar 12, 2015 at 3:29 PM, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
Well, again: the wikistats data that Erik refers to doesn't have
any granularity within the period this dataset covers. Monthly data
misses sub-monthly noise - like a massive transition that only
shows up day by day.
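
To illustrate the granularity point, here is a minimal sketch using made-up
numbers, not the actual webrequest data. The two daily levels are midpoints
of the ranges quoted further down the thread; the 28-day month and the day
of the drop are assumptions chosen purely for illustration.

```python
# Hypothetical daily pageview counts (in billions) for a 28-day month.
# HIGH and LOW are midpoints of the ranges quoted in the thread;
# DROP_DAY is an assumed value, picked only to illustrate the effect.
HIGH, LOW, DROP_DAY, DAYS = 1.15, 0.65, 15, 28

# Days before DROP_DAY sit at the high level, the rest at the low level.
daily = [HIGH if day < DROP_DAY else LOW for day in range(1, DAYS + 1)]

# A monthly report collapses the whole series into one number.
monthly_total = sum(daily)
monthly_mean = monthly_total / DAYS

# The abrupt transition is only visible in the day-over-day differences.
largest_step = max(abs(b - a) for a, b in zip(daily, daily[1:]))

print(f"monthly total: {monthly_total:.1f}B")
print(f"monthly mean:  {monthly_mean:.2f}B per day")
print(f"largest daily step: {largest_step:.2f}B")
```

With these assumed inputs the monthly mean lands between the two levels and
matches neither, while the step itself is only recoverable from the daily
series.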
On 12 March 2015 at 18:21, Toby Negrin <tnegrin(a)wikimedia.org> wrote:
> I'm also confused. As I understand it, stats.wikimedia.org is
> consuming the data that is represented by the green line in your
> graph. Therefore we would see this drop in the wikistats data
> that Erik referred to, but we don't.
> I think we need to understand why this is so.
>
> -Toby
>
> On Thu, Mar 12, 2015 at 3:10 PM, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
>>
>> Well, I'm no longer our resident anything expert, merely /a/
>> anything expert :).
>>
>> The "concoction", as you put it, comes from the
>> webrequest_all_sites data that is consumed by
>> stats.wikimedia.org's primary report - I can't speak for how the
>> dashboard you're linking to is constructed.
>> Perhaps you could? I doubt this is a "concoction" problem given
>> that, as you will note if you've studied the visualisations,
>> both the UDF and the hive query implementation (which were
>> written by two different people, and code reviewed by two /more/
>> people) agree that this dramatic, unexplained and untracked drop
>> happened. And, since we've been using the hive query
>> implementation for all our high-level numbers for about six
>> months, a bug of this magnitude in the /implementation/ of the
>> definition would be....worrying.
>>
>> Indeed, your report says 20B per month (again, is it drawing
>> from the same data source as the aggregate, high-level number?)
>> - I never claimed 1.1B a day, you did. Instead, it started off
>> as approximately 1.1-1.2Bn, before dropping down to between 600m
>> and 700m, where it has resided ever since. That sounds,
>> averaged, like approximately 0.75B, no? The disadvantage of
>> comparing a single monthly number against a more granular dataset.
>>
>> On 12 March 2015 at 17:55, Erik Zachte <ezachte(a)wikimedia.org> wrote:
>> > I'd rather see you explain this, Oliver, as our incumbent page
>> > views expert.
>> > Your concoction of legacy PV seems to suggest 'Old definition,
>> > UDF' was about 1.1B per day.
>> >
>> > Yet
>> > http://stats.wikimedia.org/EN/TablesPageViewsMonthlyAllProjects.htm
>> > shows 20B per month, 0.75B per day
>> >
>> > Erik
>> >
>> > -----Original Message-----
>> > From: analytics-bounces(a)lists.wikimedia.org
>> > [mailto:analytics-bounces@lists.wikimedia.org] On Behalf Of
>> > Oliver Keyes
>> > Sent: Thursday, March 12, 2015 19:38
>> > To: A mailing list for the Analytics Team at WMF and everybody
>> > who has an interest in Wikipedia and analytics.
>> > Subject: [Analytics] [Technical] final pageviews QA
>> >
>> > Hey all,
>> >
>> > After the patches to the definition following the previous
>> > hand-coding run (see older threads) I've run a second set of
>> > tests. These can be seen at
>> > https://commons.wikimedia.org/wiki/File:Pageviews_QA_2.png and
>> > https://commons.wikimedia.org/wiki/File:Pageviews_QA_jittered_2.png
>> >
>> > There's nothing particularly shocking in the new definition;
>> > it follows the seasonal pattern that we're used to. I think we
>> > can call the new definition done, with these tweaks! It's also
>> > not as unstable as the legacy definition (good luck to whoever
>> > now has the responsibility of explaining why pageviews
>> > abruptly halved in the middle of February).
>> >
>> >
>> > Have fun,
>> > --
>> > Oliver Keyes
>> > Research Analyst
>> > Wikimedia Foundation
>> >
>> > _______________________________________________
>> > Analytics mailing list
>> > Analytics(a)lists.wikimedia.org
>> > https://lists.wikimedia.org/mailman/listinfo/analytics
>> >
>> >
>>
>>
>>
>> --
>> Oliver Keyes
>> Research Analyst
>> Wikimedia Foundation
>>
>
>
>
--
Oliver Keyes
Research Analyst
Wikimedia Foundation