Here's what we all agree on: We want the users of Wikimedia sites to have more control over whether their data is used for application improvement purposes. To be clear, we're not talking about data collected and deleted for operational purposes.

Based on our conversations, we have three choices.

1) We take advantage of the disagreement over what DNT means and interpret it in a more restrictive way. This has its own advantages and disadvantages, as discussed on this list and others.

2) We add an opt-out option that users can use to signal that they don't want us to collect any data from them (except for operational purposes). The nice thing about this option is that Wikimedia has control over it: if browser X decides to change its DNT defaults (the IE10 example Aaron brought up), we can stay consistent in the choices we provide to users. The downside is that it will take some time to implement, and we don't have an interim solution.

3) We use DNT as an interim solution and interpret DNT as "do not log anything from me" and work towards an opt-out option.
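
To make the difference between options (2) and (3) concrete, here is a rough sketch of how the check could look. It is only an illustration under my own assumptions: the cookie name `analytics_opt_out` and the function `may_log_for_improvement` are hypothetical, and nothing here reflects how our logging is actually implemented. Data collected and deleted for operational purposes is outside the scope of this check.

```python
from typing import Mapping

# Hypothetical cookie name; the thread does not say how an opt-out would be stored.
OPT_OUT_COOKIE = "analytics_opt_out"


def may_log_for_improvement(headers: Mapping[str, str],
                            cookies: Mapping[str, str],
                            honor_dnt: bool = True) -> bool:
    """Return True if application-improvement data may be collected.

    Operational logging is not covered by this check.
    """
    # Option 2: an explicit opt-out that the site controls always wins.
    if cookies.get(OPT_OUT_COOKIE) == "1":
        return False

    # Option 3 (interim): treat "DNT: 1" as "do not log anything from me".
    if honor_dnt:
        # Header names are case-insensitive, so compare them in lowercase.
        dnt = next((v for k, v in headers.items() if k.lower() == "dnt"), None)
        if dnt is not None and dnt.strip() == "1":
            return False

    return True
```

With `honor_dnt=True` this behaves like option (3). Once an opt-out exists, setting `honor_dnt=False` would leave only the opt-out in effect, so a browser changing its DNT defaults would no longer change what we collect.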

If we have the capacity to go with option (2) and have it ready in a few months, I'd like us to go with that option. Otherwise, option (3) seems reasonable to me.

Leila

On Thu, Jan 15, 2015 at 7:23 AM, Aaron Halfaker <ahalfaker@wikimedia.org> wrote:

Christian,

It seems that people are well enough informed by the field studies that our team runs to want us to continue to run them. In fact, demand has skyrocketed both within and outside of the Wikimedia Foundation. You hold a minority opinion that testing software in the field is unnecessary. Yet field tests are considered a best practice and have become a critical part of our strategy for minimizing the disruption (and maximizing the benefits) of software changes. I don't think that many people would appreciate your proposed strategy of releasing the software and waiting for people to complain. Given how difficult it is to develop good user-facing software, it's likely that every major deployment would be disruptive if we adopted that strategy. I can speak to a few disruptions that my research helped prevent and some opportunities that it helped us explore.

Allow me to share a specific example. In this study[1], we found that telling anonymous editors to register dropped their productivity by 25%. Yet we didn't identify substantial issues in user testing. If we had not run this field experiment, we might have deployed the change thinking that we were improving Wikipedia when we were really driving good editors away. During the experiment, we received no substantial negative feedback.

For a large collection of field experiments that were used to iterate on Wikimedia software, see: https://meta.wikimedia.org/wiki/Growth

Really, what I want to say is this: If you want to improve privacy protections, I am your ally. We're merely disagreeing about whether it is good to assume that DNT means something it wasn't intended to mean or not. However, when you say that my work has no value, it's hard to talk to you productively because, honestly, I don't think your opinion is well-informed.

1. https://meta.wikimedia.org/wiki/Research:Asking_anonymous_editors_to_register/Study_1

-Aaron

On Thu, Jan 15, 2015 at 7:22 AM, Christian Aistleitner <christian@quelltextlich.at> wrote:
Hi,

On Wed, Jan 14, 2015 at 12:07:57PM -0600, Aaron Halfaker wrote:
> For example, not collecting usage data about certain sections of our
> population (e.g. IE10 users where DNT is set by default) means that we
> don't know if our software works for them.

If WMF's main form of QA was through automated usage data collection,
you'd have a point.

But actually, I think WMF is doing better than that.

From my point of view, a central pillar in QA is “software getting tested”.
That's happening widely across WMF.
Both manually and automated.
It's great already and getting better every day.

And for me the main QA ingredient is listening to feedback from the
users. Besides studies and dog-fooding, WMF's bugtracker is a
testament to that and contains reports that “$X is not working on
browser $Y” or “$X needs to also do $Z”.
And that's really great!

To me, user behaviour data collection is a way to support and assist
the above two. But it is not a requirement when trying to determine
“if our software works for them”.

Users send us emails about issues, come to IRC to discuss them,
file tickets, or just tell someone.
All without having their usage data collected.

I am convinced “IE10 users that do not want to unset DNT” are no
exception to that.

Have fun,
Christian

P.S.: I for one received bug reports from IE10 users. (But I do not
know whether or not they used DNT.)

--
---- quelltextlich e.U. ---- \\ ---- Christian Aistleitner ----
Companies' registry: 360296y in Linz
Christian Aistleitner
Kefermarkterstrasze 6a/3 Email: christian@quelltextlich.at
4293 Gutau, Austria Phone: +43 7946 / 20 5 81
Fax: +43 7946 / 20 5 81
Homepage: http://quelltextlich.at/
---------------------------------------------------------------

_______________________________________________
Analytics mailing list
Analytics@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics
