Okay, so:
I took an hour from the pageviews logs,[0] and aggregated pageviews to
enwiki (mobile and desktop both) by timestamp, down to one-second
resolution. The lowest number of pageviews to enwiki in any one second
was 2,981.
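
For anyone who wants to repeat the check, here is a rough sketch of that
aggregation in Python. It is not the query I actually ran: the record
layout (an ISO timestamp plus a site label) and the site labels themselves
are made up for illustration, and the sample rows are invented.

```python
from collections import Counter
from datetime import datetime

# Hypothetical site labels; the real logs may name these differently.
ENWIKI_SITES = {"enwiki-desktop", "enwiki-mobile"}

def pageviews_per_second(records, sites=ENWIKI_SITES):
    """Count pageviews per one-second bucket, grouping all given sites.

    `records` is an iterable of (iso_timestamp, site) pairs. That layout
    is an assumption for this sketch, not the real log schema.
    """
    counts = Counter()
    for ts, site in records:
        if site not in sites:
            continue
        # Truncate to whole seconds so sub-second timestamps share a bucket.
        second = datetime.fromisoformat(ts).replace(microsecond=0)
        counts[second] += 1
    return counts

def smallest_bucket(counts):
    """Return the (second, count) pair with the fewest pageviews.

    Seconds with no pageviews at all never enter `counts`, so this is
    the minimum over seconds that saw at least one view.
    """
    return min(counts.items(), key=lambda kv: kv[1])

# Invented sample rows, only to show the shape of the input.
sample = [
    ("2015-04-12T06:00:00", "enwiki-desktop"),
    ("2015-04-12T06:00:00.500000", "enwiki-mobile"),
    ("2015-04-12T06:00:01", "enwiki-desktop"),
    ("2015-04-12T06:00:01", "dewiki-desktop"),
]
counts = pageviews_per_second(sample)
print(smallest_bucket(counts))
```

The smallest bucket is the number that matters for a release decision,
since it is the second in which an individual reader is least hidden in
the crowd.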
So, I don't personally have a problem with generating a release of:
1. Pageviews per second;
2. To enwiki;
3. Over $TIME_PERIOD;
4. Grouping the mobile and desktop sites.
But Dario or someone should chip in before I touch anything ;p
[0] 6am yesterday. 6am because it should be low-traffic, right? At least
given our biases towards North America and Europe.
On 13 April 2015 at 11:54, Oliver Keyes <okeyes(a)wikimedia.org> wrote:
> Then that sounds much more viable. I'll run a quick test now to see
> how much clustering we'd see at, say, the one-second resolution level,
> and throw it out here so we can make more informed decisions about a
> data release on this.
>
> On 13 April 2015 at 08:08, Hirav Gandhi <hirav.gandhi(a)gmail.com> wrote:
>> Hi Oliver,
>>
>> Re: Hirav: would you be looking for temporally /and/ contextually
>> granular pageviews, i.e. "a view to X page at Y time", or just
>> temporally granular, so "a view to a page on enwiki at X time"? If the
>> latter you've got more of a shot, I suspect.
>>
>> I only want the latter - I am not concerned with the context so much as just
>> “a view to a page on enwiki at X time.”
>>
>> Hirav
>>
>>
>> On Apr 13, 2015, at 5:00 AM, analytics-request(a)lists.wikimedia.org wrote:
>>
>> Today's Topics:
>>
>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>>
>>
>> ----------------------------------------------------------------------
>>
>> Message: 1
>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>> From: Pine W <wiki.pine(a)gmail.com>
>> To: "A mailing list for the Analytics Team at WMF and everybody who
>> has an interest in Wikipedia and analytics."
>> <analytics(a)lists.wikimedia.org>
>> Cc: Bharath Sitaraman <bharath1028(a)gmail.com>
>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> basis
>> Message-ID:
>> <CAF=dyJgNUT+t6n6muJq16DuYiWP7et6ruHT3_-TZDnseP+29QQ(a)mail.gmail.com>
>> Content-Type: text/plain; charset="utf-8"
>>
>>
>> Hi,
>>
>> This issue of pageview data granularity has been discussed before, and the
>> answer has been that hourly is the smallest increment allowed to be
>> revealed publicly, for privacy reasons.
>>
>> I believe that the person you will want to discuss your request with is
>> Toby, who I have cc'd here.
>>
>> Pine
>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <hirav.gandhi(a)gmail.com> wrote:
>>
>> Hi Wikimedia Analytics Team,
>>
>> My colleague Bharath and I are doing research on dynamic server allocation
>> algorithms and we were looking for a suitable dataset to test our
>> predictive algorithm on. We noticed that Wikimedia has an amazing data set
>> of hourly page views, but we were looking for something a bit more
>> granular, such as aggregated page requests to English Wikipedia on a minute
>> by minute basis or second by second basis if possible.
>>
>> We are more than happy to pore through any raw data you might have that
>> would help us calculate page requests at this granular level. Please let us
>> know if it would be possible to get such data and if so how. Thank you in
>> advance for your help.
>>
>> Best,
>>
>> Hirav Gandhi
>> _______________________________________________
>> Analytics mailing list
>> Analytics(a)lists.wikimedia.org
>> https://lists.wikimedia.org/mailman/listinfo/analytics
>>
>>