Hi Wikimedia Analytics Team,
My colleague Bharath and I are doing research on dynamic server allocation algorithms, and we were looking for suitable datasets to test our predictive algorithm on. We noticed that Wikimedia has an amazing dataset of hourly page views, but we were looking for something a bit more granular, such as aggregated page requests to English Wikipedia on a minute-by-minute basis, or second-by-second if possible.
We are more than happy to pore through any raw data you might have that would help us calculate page requests at this granular level. Please let us know if it would be possible to get such data and, if so, how. Thank you in advance for your help.
Best,
Hirav Gandhi
Hi,
This issue of pageview data granularity has been discussed before, and the answer has been that, for privacy reasons, hourly is the smallest increment that can be released publicly.
I believe that the person you will want to discuss your request with is Toby, who I have cc'd here.
Pine
Preeetty sure that Toby is on the analytics list, Pine. He's the director of analytics.
Hirav: would you be looking for temporally /and/ contextually granular pageviews, i.e. "a view to X page at Y time", or just temporally granular, so "a view to a page on enwiki at X time"? If the latter, you've got more of a shot, I suspect.
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
Hi Oliver, re cc'ing people who are on the list: this is the protocol we followed in IEGCom to ping people who are subscribed and mentioned in certain emails but who, like many of us, may automatically move list emails directly to folders where they can go unread for days. So there is a reason to do this.
Thanks,
Pine