FYI to wikitech-l - reply to MZMcBride. See http://lists.wikimedia.org/pipermail/wikimedia-l/2014-April/071131.html to follow or contribute to the thread on wikimedia-l if you're not subscribed there already.
-Adam
---------- Forwarded message ----------
From: Adam Baso abaso@wikimedia.org
Date: Wed, Apr 16, 2014 at 11:16 AM
Subject: Re: [Wikimedia-l] Mobile Operator IP Drift Tracking and Remediation
To: Wikimedia Mailing List wikimedia-l@lists.wikimedia.org
Inline.
Thanks for starting this thread.
Sorry if I've overlooked this, but who/what will have access to this data? Only members of the mobile team? Local project CheckUsers? Wikimedia Foundation-approved researchers? Wikimedia shell users? AbuseFilter filters?
It's a good question. The thought is to put it in the customary wfDebugLog location (with, for example, filename "mccmnc.log") on fluorine.
It just occurred to me that the wiki name (e.g., "enwiki"), but not the full URL, gets logged additionally as part of the wfDebugLog call; to make the implicit explicit, wfDebugLog adds a datetime stamp as well, and that's useful for purging old records. I'll forward this email to mobile-l and wikitech-l to underscore this.
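To illustrate why the datetime stamp helps with purging, here is a minimal sketch in Python. It assumes each log line begins with a "YYYY-MM-DD HH:MM:SS" stamp followed by host, wiki name, and the message; that layout, the function name keep_recent, and the sample values are assumptions for illustration, not a description of what is actually deployed on fluorine.

```python
from datetime import datetime, timedelta

# Assumed (hypothetical) line layout:
#   "<YYYY-MM-DD HH:MM:SS> <host> <wiki>: <exit IP> <MCC-MNC>"
STAMP_FORMAT = "%Y-%m-%d %H:%M:%S"
STAMP_LENGTH = 19  # length of "YYYY-MM-DD HH:MM:SS"


def keep_recent(lines, now, max_age):
    """Return only the lines whose leading stamp is within max_age of now.

    Lines without a parseable stamp are dropped, on the assumption that
    discarding is the safer default for privacy-sensitive records.
    """
    cutoff = now - max_age
    kept = []
    for line in lines:
        try:
            stamp = datetime.strptime(line[:STAMP_LENGTH], STAMP_FORMAT)
        except ValueError:
            continue  # no usable stamp, so the record is not retained
        if stamp >= cutoff:
            kept.append(line)
    return kept


if __name__ == "__main__":
    sample = [
        "2014-04-16 11:16:00 host1 enwiki: 192.0.2.1 310-260",
        "2014-01-01 00:00:00 host1 enwiki: 192.0.2.2 310-260",
    ]
    for line in keep_recent(sample, datetime(2014, 4, 17), timedelta(days=30)):
        print(line)
```

The retention window (30 days here) is only a placeholder; the actual retention period is not settled in this thread.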
And this may be a silly question, but is there a reasonable means of approximating how identifying these two data points alone are? That is, using a mobile country code and exit IP address, is it possible to identify a particular editor or reader? Or perhaps rephrased, is this data considered anonymized?
Not a silly question. My approximation is that these tuples (datetime and XYwiki, now that it has hit me, plus exit IP and MCC-MNC) alone, although not perfectly anonymized, are low-identifying. That is, indirect inferences on the data in isolation are unlikely but technically possible, through examination of short-tail outliers in a cluster analysis where such readers/editors exist in the short-tail outlier sets. This is in contrast to regular web access logs, where direct inferences are easy.
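As a rough illustration of the outlier concern, the sketch below swaps the cluster analysis for a much simpler frequency count: it flags (wiki, exit IP, MCC-MNC) combinations that occur very rarely, since those are the records where an indirect inference would be most plausible. The function name rare_tuples, the threshold, and the sample values are hypothetical.

```python
from collections import Counter


def rare_tuples(records, max_count=1):
    """Return the distinct tuples seen at most max_count times, sorted.

    records is an iterable of (wiki, exit_ip, mcc_mnc) tuples. A plain
    frequency count stands in here for a real cluster analysis.
    """
    counts = Counter(records)
    return sorted(key for key, seen in counts.items() if seen <= max_count)


if __name__ == "__main__":
    # Illustrative values only; the IPs come from documentation ranges.
    common = ("enwiki", "192.0.2.1", "310-260")
    unusual = ("enwiki", "198.51.100.7", "310-260")
    print(rare_tuples([common, common, common, unusual]))
```

Tuples that appear many times blend into the crowd; the ones this returns are the short-tail cases where the data is least anonymous.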
Thanks. I'll forward this along now.
-Adam