And thanks for doing all of this for us! We do greatly appreciate it!

Cheers,
Bharath

On Wed, Apr 15, 2015 at 10:40 AM, Bharath Sitaraman <bharath@cs.stanford.edu> wrote:
Interested in an Erlang book? :P Pretty sure I have one of those lying around here...

Cheers,
Bharath

On Wed, Apr 15, 2015 at 10:38 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
I accept payment in books, pull requests and speaking invitations ;p.

(Updated check-the-minimum query running now!)

On 15 April 2015 at 13:35, Hirav Gandhi <hirav.gandhi@gmail.com> wrote:
> Sorry Oliver. Let me know where I can send the beer/coffee money to
> compensate you for the hard work :)
>
>
>
> On Wed, Apr 15, 2015 at 10:34 AM, Oliver Keyes <okeyes@wikimedia.org> wrote:
>>
>> /This/ you say 2.5 seconds after I've launched the query ;p. Yes, it
>> is possible, but I'll have to recalculate the likely minimum and check
>> that it's still okay.
>>
>> On 15 April 2015 at 13:32, Hirav Gandhi <hirav.gandhi@gmail.com> wrote:
>> > Hi Dario,
>> >
>> > One last question - would it be possible to break it out into mobile vs
>> > desktop? We are also concerned there might be seasonality effects in there
>> > as well. Please let us know.
>> >
>> > Best,
>> >
>> > Hirav
>> >
>> >
>> >
>> > On Wed, Apr 15, 2015 at 10:27 AM, Dario Taraborelli
>> > <dtaraborelli@wikimedia.org> wrote:
>> >>
>> >> thanks, both. Let's go ahead with English only and no spiders filtered or
>> >> mobile/desktop breakdown, per Oliver.
>> >>
>> >> Michelle – given the aggregation level I am fine moving forward with this
>> >> release, but let me know off-thread if you have any questions.
>> >>
>> >> Dario
>> >>
>> >> On Wed, Apr 15, 2015 at 9:53 AM, Oliver Keyes <okeyes@wikimedia.org>
>> >> wrote:
>> >>>
>> >>> Dario,
>> >>>
>> >>> No spider filtering, and no split between mobile and desktop; mobile
>> >>> and desktop are grouped.
>> >>>
>> >>> On 15 April 2015 at 12:46, Hirav Gandhi <hirav.gandhi@gmail.com>
>> >>> wrote:
>> >>> > e.g. German*
>> >>> >
>> >>> > I need more coffee.
>> >>> >
>> >>> >
>> >>> >
>> >>> > On Wed, Apr 15, 2015 at 9:35 AM, Hirav Gandhi
>> >>> > <hirav.gandhi@gmail.com>
>> >>> > wrote:
>> >>> >>
>> >>> >> Dario - we just want a representative sample of traffic for a popular
>> >>> >> site like Wikipedia. We thought limiting to the English Wikipedia would
>> >>> >> be easier.
>> >>> >>
>> >>> >> If we get aggregated data across all language Wikipedia sites, we would
>> >>> >> need some way to tease out which language is being queried when. Some
>> >>> >> languages (for e.g. German) we would hypothesize would have more daily
>> >>> >> seasonality than languages like English.
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> On Wed, Apr 15, 2015 at 9:32 AM, Dario Taraborelli
>> >>> >> <dtaraborelli@wikimedia.org> wrote:
>> >>> >>>
>> >>> >>> Hirav, Bharath – I also want to hear from you if there's a specific
>> >>> >>> reason to ask for English Wikipedia only or if a dataset encompassing
>> >>> >>> aggregate pageviews across all Wikimedia properties would do the job.
>> >>> >>>
>> >>> >>> Dario
>> >>> >>>
>> >>> >>> On Wed, Apr 15, 2015 at 9:09 AM, Dario Taraborelli
>> >>> >>> <dtaraborelli@wikimedia.org> wrote:
>> >>> >>>>
>> >>> >>>> Oliver -- thanks for running a preliminary check. I'm fine releasing
>> >>> >>>> this data in aggregate under CC0; I believe it would be valuable for
>> >>> >>>> this and other research projects (copying Michelle from Legal).
>> >>> >>>>
>> >>> >>>> Before we do so, though, I want to confirm the specs: aggregate
>> >>> >>>> pageviews per second to English Wikipedia, excluding bot traffic,
>> >>> >>>> broken down by access method (mobile web vs desktop site, not apps)
>> >>> >>>> for a 60-day period. Oliver – are these the filters you used to
>> >>> >>>> identify the data point with the smallest number of observations?
>> >>> >>>>
>> >>> >>>> Obviously, we will need to take into account this release when we
>> >>> >>>> start working on projects such as
>> >>> >>>> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_edits
>> >>> >>>> and
>> >>> >>>> https://meta.wikimedia.org/wiki/Research:Geo-aggregation_of_Wikipedia_pageviews
>> >>> >>>>
>> >>> >>>> Dario
>> >>> >>>>
>> >>> >>>> On Mon, Apr 13, 2015 at 9:37 PM, Oliver Keyes
>> >>> >>>> <okeyes@wikimedia.org>
>> >>> >>>> wrote:
>> >>> >>>>>
>> >>> >>>>> Bumping for Dario, per Pine's excellent example :)
>> >>> >>>>>
>> >>> >>>>> On 13 April 2015 at 22:18, Hirav Gandhi <hirav.gandhi@gmail.com>
>> >>> >>>>> wrote:
>> >>> >>>>> > Oliver: Two months is fine. Thank you so much for your help!
>> >>> >>>>> >
>> >>> >>>>> >> On Apr 13, 2015, at 4:40 PM,
>> >>> >>>>> >> analytics-request@lists.wikimedia.org
>> >>> >>>>> >> wrote:
>> >>> >>>>> >>
>> >>> >>>>> >> Send Analytics mailing list submissions to
>> >>> >>>>> >> analytics@lists.wikimedia.org
>> >>> >>>>> >>
>> >>> >>>>> >> To subscribe or unsubscribe via the World Wide Web, visit
>> >>> >>>>> >> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>> >>>>> >> or, via email, send a message with subject or body 'help' to
>> >>> >>>>> >> analytics-request@lists.wikimedia.org
>> >>> >>>>> >>
>> >>> >>>>> >> You can reach the person managing the list at
>> >>> >>>>> >> analytics-owner@lists.wikimedia.org
>> >>> >>>>> >>
>> >>> >>>>> >> When replying, please edit your Subject line so it is more
>> >>> >>>>> >> specific
>> >>> >>>>> >> than "Re: Contents of Analytics digest..."
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> Today's Topics:
>> >>> >>>>> >>
>> >>> >>>>> >> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >>> >>>>> >> 2. Re: Page views on a more frequent than hourly basis (Hirav Gandhi)
>> >>> >>>>> >> 3. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> ----------------------------------------------------------------------
>> >>> >>>>> >>
>> >>> >>>>> >> Message: 1
>> >>> >>>>> >> Date: Mon, 13 Apr 2015 13:34:23 -0700
>> >>> >>>>> >> From: Pine W <wiki.pine@gmail.com>
>> >>> >>>>> >> To: "A mailing list for the Analytics Team at WMF and
>> >>> >>>>> >> everybody
>> >>> >>>>> >> who
>> >>> >>>>> >> has an interest in Wikipedia and analytics."
>> >>> >>>>> >> <analytics@lists.wikimedia.org>
>> >>> >>>>> >> Subject: Re: [Analytics] Page views on a more frequent than
>> >>> >>>>> >> hourly
>> >>> >>>>> >> basis
>> >>> >>>>> >> Message-ID:
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> <CAF=dyJjZMdfTHZ+0+LwnHb9m8xUOd4WetGCFUXYB9Qyf7CyC5Q@mail.gmail.com>
>> >>> >>>>> >> Content-Type: text/plain; charset="utf-8"
>> >>> >>>>> >>
>> >>> >>>>> >> Hi Oliver, re ccing people who are on list, this is the protocol we
>> >>> >>>>> >> followed in IEGCom to ping people who are subscribed and mentioned in
>> >>> >>>>> >> certain emails but, like many of us, may automatically move emails from
>> >>> >>>>> >> lists directly to folders where they may be unread for days. So there
>> >>> >>>>> >> is a reason to do this.
>> >>> >>>>> >>
>> >>> >>>>> >> Thanks,
>> >>> >>>>> >>
>> >>> >>>>> >> Pine
>> >>> >>>>> >> -------------- next part --------------
>> >>> >>>>> >> An HTML attachment was scrubbed...
>> >>> >>>>> >> URL:
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> <https://lists.wikimedia.org/pipermail/analytics/attachments/20150413/aac0ef89/attachment-0001.html>
>> >>> >>>>> >>
>> >>> >>>>> >> ------------------------------
>> >>> >>>>> >>
>> >>> >>>>> >> Message: 2
>> >>> >>>>> >> Date: Mon, 13 Apr 2015 16:30:43 -0700
>> >>> >>>>> >> From: Hirav Gandhi <hirav.gandhi@gmail.com>
>> >>> >>>>> >> To: analytics@lists.wikimedia.org
>> >>> >>>>> >> Subject: Re: [Analytics] Page views on a more frequent than
>> >>> >>>>> >> hourly
>> >>> >>>>> >> basis
>> >>> >>>>> >> Message-ID:
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> <CANzC_EOvi4MP7G_SsxvW=UOjPt2vXbNfMHcipqN1pumACE-eEw@mail.gmail.com>
>> >>> >>>>> >> Content-Type: text/plain; charset="utf-8"
>> >>> >>>>> >>
>> >>> >>>>> >> Thanks Oliver!
>> >>> >>>>> >>
>> >>> >>>>> >> We would like this data for as broad a time period as you can muster.
>> >>> >>>>> >> The more days, months and years represented in the dataset, the better.
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>> Okay, so:
>> >>> >>>>> >>>
>> >>> >>>>> >>> I took an hour from the pageviews logs,[0] and aggregated pageviews to
>> >>> >>>>> >>> enwiki (mobile and desktop both) by timestamp, down to one-second
>> >>> >>>>> >>> resolution levels. The lowest number of pageviews to enwiki per second
>> >>> >>>>> >>> was 2,981.
>> >>> >>>>> >>>
>> >>> >>>>> >>> So, I don't personally have a problem with generating a release of:
>> >>> >>>>> >>>
>> >>> >>>>> >>> 1. Pageviews per second;
>> >>> >>>>> >>> 2. To enwiki;
>> >>> >>>>> >>> 3. Over $TIME_PERIOD;
>> >>> >>>>> >>> 4. Grouping the mobile and desktop site.
>> >>> >>>>> >>>
>> >>> >>>>> >>> But Dario or someone should chip in before I touch anything ;p
>> >>> >>>>> >>>
>> >>> >>>>> >>> [0] 6am yesterday. 6am because it should be low-traffic, right? At least
>> >>> >>>>> >>> given our biases towards North America and Europe.
>> >>> >>>>> >>>
>> >>> >>>>> >>> On 13 April 2015 at 11:54, Oliver Keyes
>> >>> >>>>> >>> <okeyes@wikimedia.org>
>> >>> >>>>> >>> wrote:
>> >>> >>>>> >>>> Then that sounds much more viable. I'll run a quick test now to see
>> >>> >>>>> >>>> how much clustering we'd see at, say, the one-second resolution level,
>> >>> >>>>> >>>> and throw it out here so we can make more informed decisions about a
>> >>> >>>>> >>>> data release on this.
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> On 13 April 2015 at 08:08, Hirav Gandhi
>> >>> >>>>> >>>> <hirav.gandhi@gmail.com>
>> >>> >>>>> >>>> wrote:
>> >>> >>>>> >>>>> Hi Oliver,
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Re: Hirav: would you be looking for temporally /and/ contextually
>> >>> >>>>> >>>>> granular pageviews, i.e. "a view to X page at Y time", or just
>> >>> >>>>> >>>>> temporally granular, so "a view to a page on enwiki at X time"? If the
>> >>> >>>>> >>>>> latter you've got more of a shot, I suspect.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> I only want the latter - I am not concerned with the context so much
>> >>> >>>>> >>>>> as just “a view to a page on enwiki at X time.”
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Hirav
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> On Apr 13, 2015, at 5:00 AM,
>> >>> >>>>> >>>>> analytics-request@lists.wikimedia.org
>> >>> >>>>> >>> wrote:
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Today's Topics:
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >>> >>>>> >>>>> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> ----------------------------------------------------------------------
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Message: 1
>> >>> >>>>> >>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>> >>> >>>>> >>>>> From: Pine W <wiki.pine@gmail.com>
>> >>> >>>>> >>>>> To: "A mailing list for the Analytics Team at WMF and
>> >>> >>>>> >>>>> everybody
>> >>> >>>>> >>>>> who
>> >>> >>>>> >>>>> has an interest in Wikipedia and analytics."
>> >>> >>>>> >>>>> <analytics@lists.wikimedia.org>
>> >>> >>>>> >>>>> Cc: Bharath Sitaraman <bharath1028@gmail.com>
>> >>> >>>>> >>>>> Subject: Re: [Analytics] Page views on a more frequent
>> >>> >>>>> >>>>> than
>> >>> >>>>> >>>>> hourly
>> >>> >>>>> >>>>> basis
>> >>> >>>>> >>>>> Message-ID:
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> <CAF=dyJgNUT+t6n6muJq16DuYiWP7et6ruHT3_-TZDnseP+29QQ@mail.gmail.com>
>> >>> >>>>> >>>>> Content-Type: text/plain; charset="utf-8"
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Hi,
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> This issue of pageview data granularity has been discussed before, and
>> >>> >>>>> >>>>> the answer has been that hourly is the smallest increment allowed to be
>> >>> >>>>> >>>>> revealed publicly, for privacy reasons.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> I believe that the person you will want to discuss your request with is
>> >>> >>>>> >>>>> Toby, who I have cc'd here.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Pine
>> >>> >>>>> >>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi"
>> >>> >>>>> >>>>> <hirav.gandhi@gmail.com>
>> >>> >>>>> >>> wrote:
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Hi Wikimedia Analytics Team,
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> My colleague Bharath and I are doing research on dynamic server
>> >>> >>>>> >>>>> allocation algorithms and we were looking for a suitable dataset to
>> >>> >>>>> >>>>> test our predictive algorithm on. We noticed that Wikimedia has an
>> >>> >>>>> >>>>> amazing data set of hourly page views, but we were looking for
>> >>> >>>>> >>>>> something a bit more granular, such as aggregated page requests to
>> >>> >>>>> >>>>> English Wikipedia on a minute by minute basis or second by second basis
>> >>> >>>>> >>>>> if possible.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> We are more than happy to pore through any raw data you might have that
>> >>> >>>>> >>>>> would help us calculate page requests at this granular level. Please
>> >>> >>>>> >>>>> let us know if it would be possible to get such data and if so how.
>> >>> >>>>> >>>>> Thank you in advance for your help.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Best,
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Hirav Gandhi
>> >>> >>>>> >>>>> _______________________________________________
>> >>> >>>>> >>>>> Analytics mailing list
>> >>> >>>>> >>>>> Analytics@lists.wikimedia.org
>> >>> >>>>> >>>>> https://lists.wikimedia.org/mailman/listinfo/analytics
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> ------------------------------
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Message: 2
>> >>> >>>>> >>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>> >>> >>>>> >>>>> From: Oliver Keyes <okeyes@wikimedia.org>
>> >>> >>>>> >>>>> To: "A mailing list for the Analytics Team at WMF and
>> >>> >>>>> >>>>> everybody
>> >>> >>>>> >>>>> who
>> >>> >>>>> >>>>> has an interest in Wikipedia and analytics."
>> >>> >>>>> >>>>> <analytics@lists.wikimedia.org>
>> >>> >>>>> >>>>> Cc: Bharath Sitaraman <bharath1028@gmail.com>
>> >>> >>>>> >>>>> Subject: Re: [Analytics] Page views on a more frequent
>> >>> >>>>> >>>>> than
>> >>> >>>>> >>>>> hourly
>> >>> >>>>> >>>>> basis
>> >>> >>>>> >>>>> Message-ID:
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=hxPg@mail.gmail.com>
>> >>> >>>>> >>>>> Content-Type: text/plain; charset=UTF-8
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>> >>> >>>>> >>>>> director of analytics.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> Hirav: would you be looking for temporally /and/ contextually granular
>> >>> >>>>> >>>>> pageviews, i.e. "a view to X page at Y time", or just temporally
>> >>> >>>>> >>>>> granular, so "a view to a page on enwiki at X time"? If the latter
>> >>> >>>>> >>>>> you've got more of a shot, I suspect.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> --
>> >>> >>>>> >>>>> Oliver Keyes
>> >>> >>>>> >>>>> Research Analyst
>> >>> >>>>> >>>>> Wikimedia Foundation
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> ------------------------------
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> End of Analytics Digest, Vol 38, Issue 21
>> >>> >>>>> >>>>> *****************************************
>> >>> >>>>> >>>>>
>> >>> >>>>> >> ------------------------------
>> >>> >>>>> >>
>> >>> >>>>> >> Message: 3
>> >>> >>>>> >> Date: Mon, 13 Apr 2015 19:40:04 -0400
>> >>> >>>>> >> From: Oliver Keyes <okeyes@wikimedia.org>
>> >>> >>>>> >> To: "A mailing list for the Analytics Team at WMF and
>> >>> >>>>> >> everybody
>> >>> >>>>> >> who
>> >>> >>>>> >> has an interest in Wikipedia and analytics."
>> >>> >>>>> >> <analytics@lists.wikimedia.org>
>> >>> >>>>> >> Subject: Re: [Analytics] Page views on a more frequent than
>> >>> >>>>> >> hourly
>> >>> >>>>> >> basis
>> >>> >>>>> >> Message-ID:
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> <CAAUQgdD6Z5USsu11VW49fDMBSrhYEjxKU9yOPySEriB79J-5Cg@mail.gmail.com>
>> >>> >>>>> >> Content-Type: text/plain; charset=UTF-8
>> >>> >>>>> >>
>> >>> >>>>> >> ....
>> >>> >>>>> >>
>> >>> >>>>> >>
>> >>> >>>>> >> ...years?
>> >>> >>>>> >>
>> >>> >>>>> >> We have unsampled logs for, ah. 2 months.
>> >>> >>>>> >>
>> >>> >>>>> >> On 13 April 2015 at 19:30, Hirav Gandhi
>> >>> >>>>> >> <hirav.gandhi@gmail.com>
>> >>> >>>>> >> wrote:
>> >>> >>>>> >>> Thanks Oliver!
>> >>> >>>>> >>>
>> >>> >>>>> >>> We would like this data for as broad a time period as you can muster.
>> >>> >>>>> >>> The more days, months and years represented in the dataset, the better.
>> >>> >>>>> >>>
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> Okay, so:
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> I took an hour from the pageviews logs,[0] and aggregated pageviews to
>> >>> >>>>> >>>> enwiki (mobile and desktop both) by timestamp, down to one-second
>> >>> >>>>> >>>> resolution levels. The lowest number of pageviews to enwiki per second
>> >>> >>>>> >>>> was 2,981.
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> So, I don't personally have a problem with generating a release of:
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> 1. Pageviews per second;
>> >>> >>>>> >>>> 2. To enwiki;
>> >>> >>>>> >>>> 3. Over $TIME_PERIOD;
>> >>> >>>>> >>>> 4. Grouping the mobile and desktop site.
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> But Dario or someone should chip in before I touch anything ;p
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> [0] 6am yesterday. 6am because it should be low-traffic, right? At
>> >>> >>>>> >>>> least given our biases towards North America and Europe.
>> >>> >>>>> >>>>
>> >>> >>>>> >>>> On 13 April 2015 at 11:54, Oliver Keyes
>> >>> >>>>> >>>> <okeyes@wikimedia.org>
>> >>> >>>>> >>>> wrote:
>> >>> >>>>> >>>>> Then that sounds much more viable. I'll run a quick test now to see
>> >>> >>>>> >>>>> how much clustering we'd see at, say, the one-second resolution level,
>> >>> >>>>> >>>>> and throw it out here so we can make more informed decisions about a
>> >>> >>>>> >>>>> data release on this.
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>> On 13 April 2015 at 08:08, Hirav Gandhi
>> >>> >>>>> >>>>> <hirav.gandhi@gmail.com>
>> >>> >>>>> >>>>> wrote:
>> >>> >>>>> >>>>>> Hi Oliver,
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Re: Hirav: would you be looking for temporally /and/ contextually
>> >>> >>>>> >>>>>> granular pageviews, i.e. "a view to X page at Y time", or just
>> >>> >>>>> >>>>>> temporally granular, so "a view to a page on enwiki at X time"? If the
>> >>> >>>>> >>>>>> latter you've got more of a shot, I suspect.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> I only want the latter - I am not concerned with the context so much
>> >>> >>>>> >>>>>> as just “a view to a page on enwiki at X time.”
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Hirav
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> On Apr 13, 2015, at 5:00 AM,
>> >>> >>>>> >>>>>> analytics-request@lists.wikimedia.org
>> >>> >>>>> >>>>>> wrote:
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Today's Topics:
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> 1. Re: Page views on a more frequent than hourly basis (Pine W)
>> >>> >>>>> >>>>>> 2. Re: Page views on a more frequent than hourly basis (Oliver Keyes)
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> ----------------------------------------------------------------------
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Message: 1
>> >>> >>>>> >>>>>> Date: Mon, 13 Apr 2015 00:47:31 -0700
>> >>> >>>>> >>>>>> From: Pine W <wiki.pine@gmail.com>
>> >>> >>>>> >>>>>> To: "A mailing list for the Analytics Team at WMF and
>> >>> >>>>> >>>>>> everybody
>> >>> >>>>> >>>>>> who
>> >>> >>>>> >>>>>> has an interest in Wikipedia and analytics."
>> >>> >>>>> >>>>>> <analytics@lists.wikimedia.org>
>> >>> >>>>> >>>>>> Cc: Bharath Sitaraman <bharath1028@gmail.com>
>> >>> >>>>> >>>>>> Subject: Re: [Analytics] Page views on a more frequent
>> >>> >>>>> >>>>>> than
>> >>> >>>>> >>>>>> hourly
>> >>> >>>>> >>>>>> basis
>> >>> >>>>> >>>>>> Message-ID:
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> <CAF=dyJgNUT+t6n6muJq16DuYiWP7et6ruHT3_-TZDnseP+29QQ@mail.gmail.com>
>> >>> >>>>> >>>>>> Content-Type: text/plain; charset="utf-8"
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Hi,
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> This issue of pageview data granularity has been discussed before, and
>> >>> >>>>> >>>>>> the answer has been that hourly is the smallest increment allowed to be
>> >>> >>>>> >>>>>> revealed publicly, for privacy reasons.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> I believe that the person you will want to discuss your request with is
>> >>> >>>>> >>>>>> Toby, who I have cc'd here.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Pine
>> >>> >>>>> >>>>>> On Apr 13, 2015 12:11 AM, "Hirav Gandhi"
>> >>> >>>>> >>>>>> <hirav.gandhi@gmail.com>
>> >>> >>>>> >>>>>> wrote:
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Hi Wikimedia Analytics Team,
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> My colleague Bharath and I are doing research on dynamic server
>> >>> >>>>> >>>>>> allocation algorithms and we were looking for a suitable dataset to
>> >>> >>>>> >>>>>> test our predictive algorithm on. We noticed that Wikimedia has an
>> >>> >>>>> >>>>>> amazing data set of hourly page views, but we were looking for
>> >>> >>>>> >>>>>> something a bit more granular, such as aggregated page requests to
>> >>> >>>>> >>>>>> English Wikipedia on a minute by minute basis or second by second basis
>> >>> >>>>> >>>>>> if possible.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> We are more than happy to pore through any raw data you might have that
>> >>> >>>>> >>>>>> would help us calculate page requests at this granular level. Please
>> >>> >>>>> >>>>>> let us know if it would be possible to get such data and if so how.
>> >>> >>>>> >>>>>> Thank you in advance for your help.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Best,
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Hirav Gandhi
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> ------------------------------
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Message: 2
>> >>> >>>>> >>>>>> Date: Mon, 13 Apr 2015 06:39:45 -0400
>> >>> >>>>> >>>>>> From: Oliver Keyes <okeyes@wikimedia.org>
>> >>> >>>>> >>>>>> To: "A mailing list for the Analytics Team at WMF and everybody who
>> >>> >>>>> >>>>>>     has an interest in Wikipedia and analytics."
>> >>> >>>>> >>>>>>     <analytics@lists.wikimedia.org>
>> >>> >>>>> >>>>>> Cc: Bharath Sitaraman <bharath1028@gmail.com>
>> >>> >>>>> >>>>>> Subject: Re: [Analytics] Page views on a more frequent than hourly
>> >>> >>>>> >>>>>>     basis
>> >>> >>>>> >>>>>> Message-ID:
>> >>> >>>>> >>>>>>     <CAAUQgdDsnHd8s+ACL-XBtXBz6OO-T04CcJfnGfqwrYAV-=hxPg@mail.gmail.com>
>> >>> >>>>> >>>>>> Content-Type: text/plain; charset=UTF-8
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Preeetty sure that Toby is on the analytics list, Pine. He's the
>> >>> >>>>> >>>>>> director of analytics.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Hirav: would you be looking for temporally /and/ contextually
>> >>> >>>>> >>>>>> granular pageviews, i.e. "a view to X page at Y time", or just
>> >>> >>>>> >>>>>> temporally granular, so "a view to a page on enwiki at X time"? If
>> >>> >>>>> >>>>>> the latter you've got more of a shot, I suspect.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> On 13 April 2015 at 03:47, Pine W <wiki.pine@gmail.com> wrote:
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Hi,
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> This issue of pageview data granularity has been discussed before,
>> >>> >>>>> >>>>>> and the answer has been that hourly is the smallest increment
>> >>> >>>>> >>>>>> allowed to be revealed publicly, for privacy reasons.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> I believe that the person you will want to discuss your request
>> >>> >>>>> >>>>>> with is Toby, who I have cc'd here.
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> Pine
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>> --
>> >>> >>>>> >>>>>> Oliver Keyes
>> >>> >>>>> >>>>>> Research Analyst
>> >>> >>>>> >>>>>> Wikimedia Foundation
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>> >>>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>>
>> >>> >>>> --
>> >>> >>>> Dario Taraborelli
>> >>> >>>> Senior Research Scientist, Research and Data Lead
>> >>> >>>> Wikimedia Foundation
>> >>> >>>> http://wikimediafoundation.org
>> >>> >>>> http://nitens.org/taraborelli
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>>
>> >>> >>
>> >>> >>
>> >>> >
>> >>>
>> >>>
>> >>>
>> >
>> >
>> >
>



--
Oliver Keyes
Research Analyst
Wikimedia Foundation



--
Bharath Sitaraman
bharath@cs.stanford.edu


