>I support any decrease of the storage of plain IP addresses. See also <https://www.mediawiki.org/wiki/Thread:Talk:Requests_for_comment/Structured_logging/IP_address_and_other_personal_identifying_information> for more references.

To be clear: on our end we need a buffer period so that, should there be a bug, we can reprocess pageviews if needed (this does happen). That buffer is currently 60 days. Perhaps it could be a bit shorter, but the raw data will still need to be available for a matter of weeks, not days. As mentioned earlier in the thread, we need raw IPs to geolocate requests; once that is done, the IPs are discarded.
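
For illustration only, here is a rough sketch of the flow described above: geolocate from the raw IP, keep a keyed hash for counting uniques, drop the IP itself, and treat raw data as expired once it falls outside the buffer. The names (`refine`, `hash_ip`, `raw_data_expired`, the `geolocate` callback, the record fields) are hypothetical and do not describe the actual webrequest pipeline; the 60-day constant is the buffer mentioned above.

```python
import hashlib
import hmac
from datetime import date, timedelta

# Reprocessing buffer from this thread; the exact value is a policy choice.
RETENTION_DAYS = 60


def hash_ip(ip: str, salt: bytes) -> str:
    """Keyed SHA-256 hash of an IP, usable for counting uniques."""
    return hmac.new(salt, ip.encode("utf-8"), hashlib.sha256).hexdigest()


def refine(record: dict, salt: bytes, geolocate) -> dict:
    """Return a copy of the record with the raw IP replaced by
    a country (from the caller-supplied geolocate function) and a hash."""
    refined = {k: v for k, v in record.items() if k != "ip"}
    refined["country"] = geolocate(record["ip"])
    refined["ip_hash"] = hash_ip(record["ip"], salt)
    return refined


def raw_data_expired(day: date, today: date) -> bool:
    """True once a day's raw data is older than the reprocessing buffer."""
    return today - day > timedelta(days=RETENTION_DAYS)
```

Note that a hash of an IPv4 address is only as private as its salt: without a secret (and ideally rotated) salt, the small address space makes it easy to reverse by brute force.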
On Fri, Nov 11, 2016 at 12:00 AM, Federico Leva (Nemo) <nemowiki@gmail.com> wrote:
Dan Andreescu, 10/11/2016 16:00:
I don't have as clear a reason for why we store the plain IP in
webrequest.  I think we could count uniques and all that other stuff
with the IP hash.  It's a good question, tentative +1 unless I'm
forgetting something.

I support any decrease of the storage of plain IP addresses. See also <https://www.mediawiki.org/wiki/Thread:Talk:Requests_for_comment/Structured_logging/IP_address_and_other_personal_identifying_information> for more references.

Nemo


_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics