On Thu, Jan 15, 2015 at 9:55 PM, Aaron Halfaker <ahalfaker@wikimedia.org> wrote:

What I find concerning is the idea that a biased subset of our users would be categorically ignored for this type of evaluation. If you agree with me that such evaluation is valuable to our users, I think you ought to also find such categorical exclusions concerning.

(In the e-mail below I sometimes use "we" to mean "Wikimedians" and sometimes to mean "Wikimedia Foundation employees". I am aware that this is a public discussion and that not all participants are employees of the Foundation. Hopefully the context will make my meaning clear.)

Aaron's point is valid. If we collect any data at all, we are morally obligated to do so in a way that can actually support rigorous research on questions of broad value to the community and humanity as a whole. Collecting data in a manner that we know cannot support serious research is morally obnoxious, and it invalidates the mandate under which we claim to collect any data at all.

That said, I am not convinced that adopting a strong interpretation of DNT (and acting on it) would substantially compromise our ability to do research. The bias that it potentially introduces is of comparable magnitude to the risks of bias that scientists routinely accept in the interest of meeting ethical standards and respecting the rights of individuals. The fact that participation in drug trials is voluntary and that the compensation (when there is any) is usually fixed at a set amount is a good example.

I also think that our ability to conduct research would be compromised far more substantially were we to lose the confidence of our users. The only hope we have of gaining an understanding of Wikimedia is (in my opinion) through peer collaboration with our community. The question of whether we (Foundation employees) will be able to support a broad community of inquiry has much higher stakes than whether or not our data is fully representative of all user-agents.

The fact that there is no strong legal requirement forcing our hand here, and that weaker interpretations of the header are defensible and plausible, means that there is an opportunity here to lead by example and to send a strong message to our community and to the internet at large about our values and our commitment to our users. It's an opportunity I think we should take.