Then that sounds much more viable. I'll run a quick test now to see
how much clustering we'd see at, say, the one-second resolution level,
and throw it out here so we can make more informed decisions about a
data release on this.
On 13 April 2015 at 08:08, Hirav Gandhi <hirav.gandhi(a)gmail.com> wrote:
> Hi Oliver,
>
> Re: Hirav: would you be looking for temporally /and/ contextually granular
> pageviews, i.e. "a view to X page at Y time", or just temporally granular,
> so "a view to a page on enwiki at X time"? If the latter you've got more of
> a shot, I suspect.
>
> I only want the latter - I am not concerned with the context so much as just
> “a view to a page on enwiki at X time.”
>
> Hirav
>
>
> On Apr 13, 2015, at 5:00 AM, analytics-request(a)lists.wikimedia.org wrote:
>
>
> Today's Topics:
>
> 1. Re: Page views on a more frequent than hourly basis (Pine W)
> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Mon, 13 Apr 2015 00:47:31 -0700
> From: Pine W <wiki.pine(a)gmail.com>
> To: "A mailing list for the Analytics Team at WMF and everybody who
> has an interest in Wikipedia and analytics."
> <analytics(a)lists.wikimedia.org>
> Cc: Bharath Sitaraman <bharath1028(a)gmail.com>
> Subject: Re: [Analytics] Page views on a more frequent than hourly
> basis
> Message-ID:
> <CAF=dyJgNUT+t6n6muJq16DuYiWP7et6ruHT3_-TZDnseP+29QQ(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
>
> Hi,
>
> This issue of pageview data granularity has been discussed before, and the
> answer has been that hourly is the smallest increment allowed to be
> revealed publicly, for privacy reasons.
>
> I believe that the person you will want to discuss your request with is
> Toby, who I have cc'd here.
>
> Pine
> On Apr 13, 2015 12:11 AM, "Hirav Gandhi" <hirav.gandhi(a)gmail.com> wrote:
>
> Hi Wikimedia Analytics Team,
>
> My colleague Bharath and I are doing research on dynamic server allocation
> algorithms and we were looking for suitable datasets to test our
> predictive algorithm on. We noticed that Wikimedia has an amazing data set
> of hourly page views, but we were looking for something a bit more
> granular, such as aggregated page requests to English Wikipedia on a minute
> by minute basis or second by second basis if possible.
>
> We are more than happy to pore through any raw data you might have that
> would help us calculate page requests at this granular level. Please let us
> know if it would be possible to get such data and if so how. Thank you in
> advance for your help.
>
> Best,
>
> Hirav Gandhi
> _______________________________________________
> Analytics mailing list
> Analytics(a)lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/analytics
>
>