Hi Pine --

Thanks for this -- it's a challenging topic but one that the Analytics team takes very seriously.

I'm not familiar with the IP address review that's referenced in the link, and I don't know who the staffer might be. We don't currently calculate unique visitors to anything in Analytics, and IP addresses are not a particularly accurate way to assess unique visitors in any case (due to proxies, NATs, etc.).

We do store IPs as part of page requests in our raw logs, which are deleted every 30 days. This data is kept on a system where access is limited and controlled by the operations team. We're in line with the privacy policy on this.

To be clear, we are currently considering mechanisms to count unique "requests" -- we rely on Comscore for this data, and for several reasons, primarily related to mobile usage, it's not sufficient to understand our usage patterns. We are putting together some proposals to do this in as limited a way as possible and in a way that's respectful to our users. We'll share these with the community when we feel we understand the use cases and trade-offs well enough to discuss them in an informed manner.

-Toby



We do store the IP address associated with varnish requests as part of the log. This data is  



On Thu, Oct 16, 2014 at 8:50 PM, Pine W <wiki.pine@gmail.com> wrote:
Hi again Analytics,

I was under the impression that no records are kept of which IPs access which articles on Wikipedia when no edits are made, but it appears that such records are in fact kept [1].

Is this proper? This practice appears to be permissible under the Privacy Policy, which states that "We use IP addresses for research and analytics; to better personalize content, notices, and settings for you; to fight spam, identity theft, malware, and other kinds of abuse; and to provide better mobile and other applications."

It is possible that this information is relevant for determining the number of unique visitors that Wikipedia gets and that this information is always properly filtered before it gets to the Signpost. However, given recent discussions which I thought said that Wikipedia was not instrumented to track unique visitors, I am surprised to learn that this already seems to be happening and that the situation has been this way for some time, so I would appreciate clarification.

I want to emphasize that this question is about clarifying the practice of tracking likely unique visitors by IP. This question is not intended to start flame wars, get people into trouble, or limit the Signpost's access to properly filtered information if there has been a determination that WMF's retention of the raw data is appropriate. There might be appropriate secondary questions about making sure that access to the raw IP access data is carefully contained and secured.

Thank you very much,

Pine

[1] https://en.wikipedia.org/w/index.php?title=User_talk%3ASerendipodous&diff=629934257&oldid=629932288


_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics