This is not our intention for the long term, we are in the middle of
putting in place a sanitization strategy to get rid of any PII after 90
days. This discussion might make more sense in another thread though,
kindly please do not hijack Sajjad's thread :)
Here is where our internal discussion regarding sanitization stands so
far:
For users who have ethical concerns about their data being gathered via
EventLogging, we have thought about providing an incognito mode. Incognito
mode would be "on" by default if you browse with cookies off. That is, if
your browser is set to not make use of cookies, no data would be sampled.
This is so far just an idea.
Regarding anonymization: after much discussion, we believe that the only
way to properly anonymize EventLogging data is aggregation, and for that we
need to build infrastructure that will "consume" EventLogging events. At
this time EventLogging just samples discrete events, so data is stored as
"discrete" data points. That being said, IPs are always anonymized in any
EventLogging dataset. User Agents are not.
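
To illustrate what "consuming" discrete events into aggregates could look
like, here is a minimal sketch. It is not the infrastructure we intend to
build; the event fields ("schema", "timestamp", "userAgent") and the
function name are hypothetical and chosen only for the example. The point
is that once events are reduced to counts, per-event fields such as the
User Agent are no longer stored.

```python
from collections import Counter


def aggregate_events(events):
    """Collapse discrete events into per-schema, per-day counts.

    The field names used here ("schema", "timestamp") are assumptions
    made for illustration, not the actual EventLogging layout.
    """
    counts = Counter()
    for event in events:
        day = event["timestamp"][:10]  # keep only YYYY-MM-DD
        counts[(event["schema"], day)] += 1
    # Only the counts survive; per-event fields such as the User Agent
    # are dropped.
    return dict(counts)


if __name__ == "__main__":
    sample = [
        {"schema": "Edit", "timestamp": "2014-03-13T09:32:00", "userAgent": "UA-1"},
        {"schema": "Edit", "timestamp": "2014-03-13T12:43:00", "userAgent": "UA-2"},
        {"schema": "View", "timestamp": "2014-03-14T00:56:00", "userAgent": "UA-1"},
    ]
    print(aggregate_events(sample))
```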
We shall be updating this wiki in the near future with more information:
https://www.mediawiki.org/wiki/EventLogging/UserAgentSanitization
On Thu, Mar 13, 2014 at 12:43 PM, Dan Andreescu <dandreescu(a)wikimedia.org> wrote:
On Thu, Mar 13, 2014 at 9:32 AM, Federico Leva (Nemo)
<nemowiki(a)gmail.com> wrote:
Andrew Gray, 13/03/2014 00:56:
For that matter, surely this data won't exist anyway before 2013 or so?
I'm not sure how long we retain IP data for logged-in users, but I'd be
a bit startled if it was five years.
EventLogging can contain almost anything I think. Is there any purging?
I don't think so. Is it aggregate and anonymised? No longer.
<https://www.mediawiki.org/w/index.php?title=Extension:EventLogging&diff=prev&oldid=905171>
On Thu, Mar 13, 2014 at 5:19 AM, Nuria Ruiz <nuria(a)wikimedia.org> wrote:
Sorry, but this is not correct:
IP addresses are anonymized in Event Logging and always have been. We
calculate an HMAC with a rotating salt that changes either every 90 days
or with a service restart.
Event Logging data has never been aggregated; it is a system to log
discrete events. There have not been any changes in this regard as of late.
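
As a rough illustration of the HMAC scheme described above, here is a
minimal sketch. It is not the actual Event Logging code: the class name,
the choice of SHA-256, and the 32-byte salt are assumptions made for the
example. The salt lives only in memory, so a restart yields a new one, and
it is regenerated once it is 90 days old.

```python
import hashlib
import hmac
import secrets
import time

ROTATION_SECONDS = 90 * 24 * 60 * 60  # 90 days, as described above


class RotatingSaltAnonymizer:
    """Hypothetical helper: keyed hash of IPs with an in-memory salt."""

    def __init__(self, now=time.time):
        self._now = now  # injectable clock, useful for testing
        self._rotate()

    def _rotate(self):
        # A fresh random salt; never persisted, so a restart discards it.
        self._salt = secrets.token_bytes(32)
        self._created = self._now()

    def anonymize(self, ip):
        # Rotate the salt once it is 90 days old.
        if self._now() - self._created >= ROTATION_SECONDS:
            self._rotate()
        digest = hmac.new(self._salt, ip.encode("utf-8"), hashlib.sha256)
        return digest.hexdigest()


if __name__ == "__main__":
    anonymizer = RotatingSaltAnonymizer()
    print(anonymizer.anonymize("192.0.2.1"))
```

Within one salt period the same IP always maps to the same token, so
events can still be grouped; after a rotation or a restart the old tokens
can no longer be linked to new ones.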
What Nuria said is correct, however, we do store some data, such as User
Agents currently. This is not our intention for the long term, we are in
the middle of putting in place a sanitization strategy to get rid of any
PII after 90 days. This discussion might make more sense in another thread
though, kindly please do not hijack Sajjad's thread :)
_______________________________________________
Analytics mailing list
Analytics(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics