Hi Simon, I'm copying the analytics mailing list on this message, as it is the best way to get answers to your requests for data or questions about technical aspects of the analytics systems.
The dataset you ask for contains data that we don't provide without an NDA. To be precise, we don't publicly disclose precisely timestamped hits, in order to prevent sessions from being easily reconstructed. The easiest way for you to get that data would be to set up a formal collaboration with WMF, involving an NDA. I'm not an expert in how to do that; you may want to contact the research team (wiki-research-l@lists.wikimedia.org) and read more here: https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations. Best, Joseph
On Fri, Jan 26, 2018 at 1:55 PM, Jianyun Sun simonjoylet@gmail.com wrote:
Hi joal,
I'm a student at Southeast University, currently doing research on better scheduling of web requests. For my experiments, I need Wikimedia web request data, especially page request records with timestamps and response sizes. One month of data would be enough. Could you please send me a copy, or help me get an access ticket on Hive so I can get it myself?
I'm looking forward to your reply. Thank you sincerely!
Simon 2018.01.26
Update: I've read this thread and am in contact with the researcher on a separate thread.
Best, Leila
-- Leila Zia Senior Research Scientist Wikimedia Foundation
-- Joseph Allemandou Data Engineer @ Wikimedia Foundation IRC: joal
Analytics mailing list Analytics@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/analytics