If you're producing analyses that call out
individual editors, then yes,
it would be wise to make such tools opt-in.
That makes all the difference. I'd also love to see such viz. for my own
edits and probably wouldn't mind sharing it.
And I'm not arguing against mining these data for research. I trust that
research will focus on generalized findings,
and that any example given in an article will be one for which consent has been given.
My point is rather that if we provide generic tools as a service to the
research community, the issue of opt-in will sooner or later become moot.
Someone will take the tool, add the category cloud, and start wikigossip.com
(just checked: domain is reserved).
I know this is a general trend anyway, lots of tools already exist that help
you analyze someone's presence on the web.
But for every Wikipedian who would rather not, there
are ten more (like
me) that really want more insight into the rich data set of our
On an aggregate level or secure access level, yes. Not to feed our
I'm sure no one here has that in mind and of course I wasn't implicating
Just raising awareness of what it could lead to.
From: Steven Walling [mailto:email@example.com]
Sent: Wednesday, March 23, 2011 18:30
To: Research into Wikimedia content and communities
Cc: Erik Zachte; aforte(a)gatech.edu
Subject: Re: [Wiki-research-l] edit counts for specific users
On Wed, Mar 23, 2011 at 5:46 AM, Erik Zachte <erikzachte(a)infodisiac.com>
At Wikimania Boston, 2006, visualization experts Fernanda Viégas and
Martin Wattenberg presented a tool which could produce a tag cloud from a
person's edit history. Tag clouds were a novelty and very suitable for the
matter at hand. You could see at a glance that editor Johanna Doe was mainly
engaged in articles about, say, classical music and Chinese and Iranian politics,
which is OK of course, but maybe better left to the person to disclose at
her own discretion. We discussed implications of the visualization: on one
hand this was all data from the public dumps, and anyone could make such a
script once the idea spread; on the other hand, would it be wise to help
facilitate this process? I later found out they decided not to publish the
tool for this very reason.
See the first two entries on http://infodisiac.com/Wikimedia/Visualizations/
That is really sad.
As a Wikipedian, I would hate to see any researcher shy away from publishing
interesting and insightful visualizations of public data.
If you're producing analyses that call out individual editors, then yes, it
would be wise to make such tools opt-in. But for every Wikipedian who would
rather not, there are ten more (like me) that really want more insight into
the rich data set of our editing histories.