I second Aaron’s concerns, which I previously expressed during the consultation about the new privacy policy. My main objection to the proposed solution is that by saying “Wikimedia honors DNT headers” we imply – by the most popular/de facto interpretation of DNT – that we do 3rd party tracking but allow users to opt out, which puts WMF on par with the sites that adopt aggressive tracking practices.

I’d rather focus on a clean and transparent implementation of an opt-out mechanism that doesn’t create confusion and gives the user a clear understanding of what s/he is opting out of, instead of piggybacking on DNT.

I too am worried about the impact of excluding a segment of the user population from the (aggregate) measurements that we obtain via instrumentation and use to assess the impact of Product changes, but I’m ready to push the discussion of what is an acceptable tradeoff to our customers (the community and decision-makers at WMF). It’s also worth remembering that all data collected via EventLogging that contains PII such as IP addresses or raw UserAgents is subject to our data retention guidelines. [1]


[1] https://meta.wikimedia.org/wiki/Data_retention_guidelines

On Jan 16, 2015, at 1:29 PM, Aaron Halfaker <ahalfaker@wikimedia.org> wrote:


I agree on all points.  My assertions are this:
  1. DNT means 3rd party tracking.  It's in the definition.  
  2. However, we'd like to have a strict interpretation and act beyond the definition.  This empowers our users and sets a good precedent. 
  3. The categorical exclusion of a substantial set of our users from field studies is concerning and can cause problems.
Though Nuria pointed out that DNT/IE10 is not the only potential categorical exclusion, that does not reduce the problem.  If we can confirm that this won't cause a substantial issue, or implement a strategy to make sure it does not, then this won't be a problem.


On Fri, Jan 16, 2015 at 1:42 PM, Ori Livneh <ori@wikimedia.org> wrote:

On Thu, Jan 15, 2015 at 9:55 PM, Aaron Halfaker <ahalfaker@wikimedia.org> wrote:

What I find concerning is the idea that a biased subset of our users would be categorically ignored for this type of evaluation.  If you agree with me that such evaluation is valuable to our users, I think you ought to also find such categorical exclusions concerning.

(In the e-mail below I sometimes use "we" to mean "Wikimedians" and sometimes to mean "Wikimedia Foundation employees". I am aware that this is a public discussion and that not all participants are employees of the Foundation. Hopefully the context will make my meaning clear.)

Aaron's point is valid. If we collect any data at all, we are morally obligated to do so in a way that can actually support rigorous research on questions of broad value to the community and humanity as a whole. Collecting data in a manner that we know cannot support serious research is morally obnoxious and it invalidates the mandate we claim to collect any data at all.

That said, I am not convinced that adopting a strong interpretation of DNT (and acting on it) would substantially compromise our ability to do research. The bias that it potentially introduces is of comparable magnitude to the risks of bias that scientists routinely accept in the interest of meeting ethical standards and respecting the rights of individuals. The fact that participation in drug trials is voluntary and that the compensation (when there is any) is usually fixed at a set amount is a good example.

I also think that our ability to conduct research would be compromised far more substantially were we to lose the confidence of our users. The only hope we have of gaining an understanding of Wikimedia is (in my opinion) through peer collaboration with our community. The question of whether we (Foundation employees) will be able to support a broad community of inquiry has much higher stakes than whether or not our data is fully representative of all user-agents.

The fact that there is no strong legal requirement forcing our hand here, and that weaker interpretations of the header are defensible and plausible, means that there is an opportunity here to lead by example and to send a strong message to our community and to the internet at large about our values and our commitment to our users. It's an opportunity I think we should take.

Analytics mailing list