Daniel Kinzler wrote:
(This mail is in response to a thread on Commons-l; I'm cross-posting it
to the toolserver list, because it seems relevant there. Ultimately,
this should probably be discussed with the WMF and its local associates.)
Please note that publishing "intelligence" obtained from analyzing
person-related data may be considered a violation of that person's
privacy, even if the analyzed data is publicly available - at least
under German law and, afaik, EU guidelines. Data mining can expose
things about a person that are not easily found out by looking straight
at the raw data - this is often problematic, especially since the
results can be quite misleading, as per the nature of the methods used.
For some value of "intelligence", yes.
Whether this is stupid or not is beside the point. We have been
explicitly asked by the German Wikimedia e.V. not to make any analysis of
user data available on the toolserver, so we won't (although I find it a
bit hard to draw a line). Whether it would be legal for the US-based
foundation to do it is a different question. A different question still is
whether it would be wise and desirable. To quote Wau Holland and the "hacker
ethics" of the Chaos Computer Club: "utilize public data, protect
private data". Information wants to be free - but so do people. In the
end, the latter are more important.
The data we are dealing with *IS* public data.
I personally feel that any analysis that exposes information that is not
*relevant* to activity on the project should be strictly opt-in. An
example would be a breakdown of user activity by time of day or day of
the week. I'm not sure about things like the number of untagged images a
user has uploaded, for example - that does seem relevant to me. Whether it's
legal or wise to expose such an analysis is an open question to me
(actually, I'd like to have some input on this, since my tools can give
It's all relevant. Who we should make it available to is possibly a
On the other hand, I believe that admins should expect to be subject to
"public oversight". This is in my opinion an important part of an
informal "watch the powerful" mechanism. We already have the ability to
see a list of "admin actions" an admin has performed. I'm a bit unsure
about consolidating that log data into statistics of deletions per
week or whatever - I think we should ask ourselves how useful that would
really be. In any case it should be made more obvious to people what
"data trails" they leave when working on Wikimedia projects, as a
"normal" user and as an admin.
General statistics about admin activity - i.e. the sum over all admins, not
per person - would be quite interesting, though, and unproblematic.
How about a compromise here? Make the sum totals available to anyone,
and the individual breakdowns only available to admins?
Alphax - http://en.wikipedia.org/wiki/User:Alphax
Contributor to Wikipedia, the Free Encyclopedia
"We make the internet not suck" - Jimbo Wales
Public key: http://en.wikipedia.org/wiki/User:Alphax/OpenPGP