I'm not exactly sure how one provides an anonymised dataset that contains
IP addresses. But:
We don't have those navigation paths and so can't provide them. Sure, we
could provide the {referer, URL} tuples associated with specific IP
addresses, and replace the IP with some kind of randomly-generated value
(or just a salted hash) but this falls apart very quickly with the modern
structure of the internet and the scale Wikimedia properties operate on:
you can have a lot of distinct people at one IP address, particularly
through cellular networks, and so multiple sessions and trails can get
inaccurately grouped together. More importantly, browsing over HTTPS
often results in referers being either sanitised or stripped completely,
rendering those chains impossible to reconstruct.
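As a rough illustration of the grouping problem described above, here is a
minimal Python sketch. The events, the salt handling, and the function names
(pseudonymise, build_trails) are all made up for the example; it is not a
description of how Wikimedia's request logs are actually processed.

```python
import hashlib
import secrets
from collections import defaultdict

# Hypothetical salt: assumed to be generated once and never published.
SALT = secrets.token_bytes(16)


def pseudonymise(ip: str, salt: bytes = SALT) -> str:
    """Replace an IP address with a salted SHA-256 digest (truncated for readability)."""
    return hashlib.sha256(salt + ip.encode("utf-8")).hexdigest()[:12]


def build_trails(events, salt: bytes = SALT):
    """Group (ip, referer, url) events by pseudonymised IP, keeping arrival order."""
    trails = defaultdict(list)
    for ip, referer, url in events:
        trails[pseudonymise(ip, salt)].append((referer, url))
    return dict(trails)


# Made-up events: two different readers share one address (e.g. a cellular gateway).
events = [
    ("198.51.100.7", None, "/wiki/Cat"),                           # reader A
    ("198.51.100.7", None, "/wiki/Physics"),                       # reader B
    ("198.51.100.7", "/wiki/Cat", "/wiki/Lion"),                   # reader A
    ("198.51.100.7", "/wiki/Physics", "/wiki/Quantum_mechanics"),  # reader B
    ("203.0.113.9", None, "/wiki/Dog"),                            # unrelated reader
]

trails = build_trails(events)
for pseudonym, trail in trails.items():
    print(pseudonym, trail)
```

Both readers behind 198.51.100.7 end up under a single pseudonym, so their
two separate trails are merged into one four-step "session". Hashing hides the
address itself but does nothing to separate the people behind it.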
I believe Leila Zia and Bob West (who will hopefully see this message; I
know Leila is on this list!) are currently working on a project that looks
at search paths, and they may have additional commentary. But generally
speaking: we do not generate this data as a matter of course, we would not
be comfortable releasing it (unless exceedingly sanitised), and, as the
person who deals with our request logs on a day-to-day basis, I can think
of a half-dozen ways in which it would produce false results (ways whose
probability of occurring we have no real way of checking).
On 28 December 2014 at 22:53, Ditty Mathew <dittyvkm(a)gmail.com> wrote:
The exact user information is not needed. The anonymized data is enough.
What we need, exactly, is the navigation paths of Wikipedia readers.
with regards
Ditty
On Sun, Dec 28, 2014 at 9:46 PM, Oliver Keyes <okeyes(a)wikimedia.org>
wrote:
Afraid not. First, we do not have some of those datapoints; we do not
currently have unique user IDs. And, second, it would be a tremendous
ethical violation for us to release the data that we /do/ have (IP
addresses, for example).
On 28 December 2014 at 21:00, Ditty Mathew <dittyvkm(a)gmail.com> wrote:
Hi,
Is the readers' click log data (which should contain user ID/IP, article
title, and timestamp) available for Wikipedia?
with regards
Ditty
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Oliver Keyes
Research Analyst
Wikimedia Foundation