How would you anonymize the data? This is very difficult. If a user is pseudonymized with a random identifier, it is not difficult to re-identify the user by triangulating from the pages viewed. This is particularly the case if the user is a Wikipedian: the user will often read his/her own user talk page and the pages s/he edits.
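To illustrate the point, here is a minimal sketch in Python of how such a re-identification might go. The click log, the pseudonyms and the function name `guess_identities` are all made up for illustration; this is not real Wikimedia data or any existing tool. It simply guesses that a pseudonym belongs to the account whose user talk page it views most often.

```python
from collections import Counter, defaultdict

# Hypothetical pseudonymized click log: (random_id, page_title) pairs.
# All identifiers and titles are invented for this example.
click_log = [
    ("a91f", "User talk:Alice"),
    ("a91f", "Differential privacy"),
    ("a91f", "User talk:Alice"),
    ("7c02", "Netflix Prize"),
    ("7c02", "User talk:Bob"),
    ("7c02", "User talk:Alice"),
    ("7c02", "User talk:Bob"),
    ("e5d3", "Main Page"),
]


def guess_identities(log, prefix="User talk:"):
    """Map each pseudonym to the user talk page it views most often."""
    talk_views = defaultdict(Counter)
    for pseudonym, title in log:
        if title.startswith(prefix):
            # Count views per user name (the title without the prefix).
            talk_views[pseudonym][title[len(prefix):]] += 1
    # Pseudonyms that never view a user talk page get no guess.
    return {p: c.most_common(1)[0][0] for p, c in talk_views.items()}


guesses = guess_identities(click_log)
print(guesses)
```

In this toy log the random identifiers give no protection to the two readers who keep returning to one talk page, while the reader who only visits the Main Page stays unlinked. A real attack would probably also combine the public edit history with the timestamps in the log.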
Readings:
https://en.wikipedia.org/wiki/AOL_search_data_leak
https://en.wikipedia.org/wiki/Differential_privacy#Netflix_Prize
best regards Finn Årup Nielsen
On 29-12-2014 at 04:53, Ditty Mathew wrote:
The exact user information is not needed; anonymized data is enough. What we need is the navigation paths of Wikipedia readers.
with regards
Ditty
On Sun, Dec 28, 2014 at 9:46 PM, Oliver Keyes <okeyes@wikimedia.org> wrote:
Afraid not. First, we do not have some of those data points; we do not currently have unique user IDs. Second, it would be a tremendous ethical violation for us to release the data that we /do/ have (IP addresses, for example).

On 28 December 2014 at 21:00, Ditty Mathew <dittyvkm@gmail.com> wrote:
Hi,
Is the reader click log data (which should contain user ID/IP, article title, and timestamp) available for Wikipedia?
with regards
Ditty
--
Oliver Keyes
Research Analyst
Wikimedia Foundation
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l