Hi Simon,
I'm copying the analytics mailing list on this message, as it is the best way to
get answers to your requests about the data or technical aspects of the analytics
systems.
The dataset you are asking for contains data that we don't provide without an NDA.
To be precise, we don't publicly disclose hits with precise timestamps,
in order to prevent sessions from being easily reconstructed.
Now, the easiest way for you to get your hands on that data would be to set
up a formal collaboration with WMF, involving an NDA.
I'm not an expert in how to do that; you may want to contact the
research team (wiki-research-l(a)lists.wikimedia.org) and read more here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations.
Best
Joseph
On Fri, Jan 26, 2018 at 1:55 PM, Jianyun Sun <simonjoylet(a)gmail.com> wrote:
Hi joal,
I'm a student at Southeast University, and I'm currently doing research on
better scheduling of web requests.
For my experiments, I need Wikimedia web request data, especially
page request records with timestamps and response sizes. One month of
data is enough. Could you please send me a copy or help me get an access
ticket on Hive so I can get it myself?
I'm looking forward to your reply. Thank you sincerely!
Simon
2018.01.26
--
*Joseph Allemandou*
Data Engineer @ Wikimedia Foundation
IRC: joal