Daniel Kinzler wrote:
(This mail is in response to a thread on Commons-l; I'm cross-posting it to the toolserver list, because it seems relevant there. Ultimately, this should probably be discussed with the WMF and its local associates.)
Please note that publishing "intelligence" obtained from analyzing person-related data may be considered a violation of that person's privacy, even if the analyzed data is publicly available - at least under German law and, afaik, EU guidelines. Data mining can expose things about a person that are not easily found out by looking straight at the raw data - this is often problematic, especially since the results can be quite misleading, as per the nature of the methods used.
For some value of "intelligence", yes.
Whether this is stupid or not is beside the point. We have been asked explicitly by the German Wikimedia e.V. not to make any analysis of user data available on the toolserver, so we won't (although I find it a bit hard to draw a line). Whether it would be legal for the US-based foundation to do it is a different question. A different question still is whether it would be wise and desirable. To quote Wau Holland and the "hacker ethics" of the Chaos Computer Club: "utilize public data, protect private data". Information wants to be free - but so do people. In the end, the latter are more important.
The data we are dealing with *IS* public data.
I personally feel that any analysis that exposes information that is not *relevant* to activity on the project should be strictly opt-in. An example would be a breakdown of user activity by time of day or day of the week. I'm not sure about things like the number of untagged images a user has uploaded, for example - that does seem relevant to me. Whether it's legal or wise to expose such an analysis is an open question to me (actually, I'd like to have some input on this, since my tools can give that statistic).
It's all relevant. Who we should make it available to is possibly a different matter.
On the other hand, I believe that admins should expect to be subject to "public oversight". This is in my opinion an important part of an informal "watch the powerful" mechanism. We already have the ability to see a list of the "admin actions" an admin has performed. I'm a bit unsure about consolidating that log data into statistics such as deletions per week or whatever - I think we should ask ourselves how useful that would really be. In any case it should be made more obvious to people what "data trails" they leave when working on Wikimedia projects, as a "normal" user and as an admin.
General statistics about admin activity - i.e. the sum over all admins, not per person - would be quite interesting, though, and unproblematic.
How about a compromise here? Make the sum totals available to anyone, and the individual breakdowns only available to admins?
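To make the compromise a bit more concrete, here is a rough sketch in Python of how the two views could be separated. Everything in it is an assumption for illustration: the log entries are made up, and the names ADMIN_LOG, total_actions, per_admin_breakdown and the viewer_is_admin flag are hypothetical, not anything the toolserver or MediaWiki actually provides.

```python
from collections import Counter

# Hypothetical log entries: (admin name, action type). Not real data.
ADMIN_LOG = [
    ("AdminA", "delete"),
    ("AdminB", "delete"),
    ("AdminA", "block"),
    ("AdminA", "delete"),
]


def total_actions(log):
    """Sum over all admins, with no per-person detail; open to anyone."""
    return Counter(action for _admin, action in log)


def per_admin_breakdown(log, viewer_is_admin):
    """Per-person counts; only handed out when the viewer is an admin."""
    if not viewer_is_admin:
        raise PermissionError("individual breakdowns are restricted to admins")
    breakdown = {}
    for admin, action in log:
        # One Counter per admin, created on first sight of that admin.
        breakdown.setdefault(admin, Counter())[action] += 1
    return breakdown
```

The point of splitting it this way is that the public function never touches the admin names at all, so nothing per-person can leak from it; how viewer_is_admin gets established is left open here.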