I second Aaron’s concerns, which I previously expressed during the consultation: by saying
“Wikimedia honors DNT headers” we imply – by the most popular/de facto interpretation of
DNT – that we do 3rd party tracking but allow users to opt out, which puts WMF on par
with the aggressive tracking practices adopted by most sites.
I’d rather focus on a clean and transparent implementation of an opt-out mechanism that
doesn’t create confusion and gives the user a clear understanding of what s/he is opting
out of, instead of piggybacking on DNT.
I too am worried about the impact of excluding a segment of the user population from
(aggregate) measurements that we obtain via instrumentation and that we use to assess the
impact of Product changes, but I’m ready to push the discussion of what is an acceptable
tradeoff to our customers (the community and decision-makers at WMF). It’s also worth
remembering that all data collected via EventLogging that contains PII such as IP addresses
or raw UserAgents is subject to our data retention guidelines.
On Jan 16, 2015, at 1:29 PM, Aaron Halfaker wrote:
I agree on all points. My assertions are these:
DNT means 3rd party tracking. It's in the definition.
However, we'd like to have a strict interpretation and act beyond the definition.
This empowers our users and sets a good precedent.
The categorical exclusion of a substantial set of our users from field studies is
concerning and can cause problems.
Though Nuria pointed out that DNT/IE10 is not the only potential categorical exclusion,
that does not reduce the problem. If we can confirm that this won't cause a
substantial issue or implement a strategy to make sure it does not, then this won't be
On Fri, Jan 16, 2015 at 1:42 PM, Ori Livneh <ori(a)wikimedia.org> wrote:
On Thu, Jan 15, 2015 at 9:55 PM, Aaron Halfaker <ahalfaker(a)wikimedia.org> wrote:
What I find concerning is the idea that a biased subset of our users would be
categorically ignored for this type of evaluation. If you agree with me that such
evaluation is valuable to our users, I think you ought to also find such categorical
(In the e-mail below I sometimes use "we" to mean "Wikimedians" and
sometimes to mean "Wikimedia Foundation employees". I am aware that this is a
public discussion and that not all participants are employees of the Foundation. Hopefully
the context will make my meaning clear.)
Aaron's point is valid. If we collect any data at all, we are morally obligated to do
so in a way that can actually support rigorous research on questions of broad value to the
community and humanity as a whole. Collecting data in a manner that we know cannot support
serious research is morally obnoxious and it invalidates the mandate we claim to collect
any data at all.
That said, I am not convinced that adopting a strong interpretation of DNT (and acting on
it) would substantially compromise our ability to do research. The bias that it
potentially introduces is of comparable magnitude to the risks of bias that scientists
routinely accept in the interest of meeting ethical standards and respecting the rights of
individuals. The fact that participation in drug trials is voluntary and that the
compensation (when there is any) is usually fixed at a set amount is a good example.
I also think that our ability to conduct research would be compromised far more
substantially were we to lose the confidence of our users. The only hope we have of
gaining an understanding of Wikimedia is (in my opinion) through peer collaboration with
our community. The question of whether we (Foundation employees) will be able to support a
broad community of inquiry has much higher stakes than whether or not our data is fully
representative of all user-agents.
The fact that there is no strong legal requirement forcing our hand here and that weaker
interpretations of the header are defensible and plausible means that there is an
opportunity here to lead by example and to send a strong message to our community and
to the internet at large about our values and our commitment to our users. It's an
opportunity I think we should take.
Analytics mailing list