[Foundation-l] Release of squid log data

Brock Weller brock.weller at gmail.com
Sat Sep 15 15:31:48 UTC 2007


The only ethical choice? Are you employing hyperbole, or are you serious?
Let's set aside the IP request for now and just look at the two releases
made so far, without personal info. You're seriously suggesting that
letting someone infer from edits that person X, whose identity remains
unknown, likes to edit about kangaroos and trees but not tree kangaroos,
and using that to build a deeper, better understanding of mankind and
the way we function, is the unethical choice? Like I said, leaving out
this new request and just going with the two previous releases, it
would seem almost criminal not to make the non-personal data available
for research.

On 9/15/07, Brian <Brian.Mingus at colorado.edu> wrote:
> Wikiresearch-l had a roundtable about this at Wikimania two years ago. We
> reached no conclusion.  I would love to pipe this data through my quality
> classifier, especially combined with the edit histories of the associated
> users. But do you realize what kind of a double whammy that is? Not only do
> you have their surfing habits, you've got their editing habits. On one of
> the largest websites in the world. This data is of unspeakable value not
> only to researchers, but to spammers, would-be identity thieves and others.
>
> Although having this data is a wet dream of mine, I find it unconscionable
> to release it, and I feel that whoever was responsible for releasing it has
> already overstepped their bounds. We already know from the New York Times
> analyzing AOL's search logs that persons can be identified from search logs,
> and we know from Microsoft's Non-Disclosure Agreements with universities
> around the world for portions of the Windows 2000 source code that these
> NDAs, even to universities, are not effective in stopping the data from
> being leaked.
>
> Now that the data has already been released, it is imperative that the
> foundation create an explicit philosophy about data retention policies and
> the circumstances under which user data may be released. I suggest that it
> never be released, and that the foundation hire and/or appoint a
> statistician for analyzing logs in-house. Perhaps this person can act as a
> liaison in certain, well defined situations that do not compromise the
> personal information of anyone beyond what is already available in database
> dumps. This is the only ethical approach in my opinion.
>
> On 9/15/07, Erik Moeller <erik at wikimedia.org> wrote:
> >
> > On 9/14/07, Tim Starling <tstarling at wikimedia.org> wrote:
> > > For a while now, we've been releasing squid log data, stripped of
> > > personally identifying information such as IP addresses, to groups at
> > > two universities: Vrije Universiteit and the University of Minnesota. We
> > > now have a request pending from a third group, at Universidad Rey Juan
> > > Carlos in Spain. They are asking if they can have the full data stream
> > > including IP addresses, and they are prepared to sign a confidentiality
> > > agreement to get it.
> >
> > "Wikimedia will not sell or share private information, such as email
> > addresses, with third parties, unless you agree to release this
> > information, or it is required by law to release the information."
> > http://wikimediafoundation.org/wiki/Privacy_policy
> >
> > Under the current policy I would not support it, even if "private
> > information" is somewhat ambiguous: we must err on the side of
> > caution.
> >
> > I might support a research exemption clause in future versions of the
> > policy _if_ a compelling case can be made that such an exemption is
> > needed, and that no alternative research method would produce results
> > of approximately the same quality. So far no such case has been made.
> >
> > Whatever we do, it is crucial that we make it clear to our users
> > through our privacy policy what is going on. In that spirit, I would
> > also appreciate it if the privacy policy could be updated to describe
> > the existing agreements with universities, and the work that is being
> > done on the toolserver.
> > --
> > Toward Peace, Love & Progress:
> > Erik
> >
> > DISCLAIMER: This message does not represent an official position of
> > the Wikimedia Foundation or its Board of Trustees.
> >
> > _______________________________________________
> > foundation-l mailing list
> > foundation-l at lists.wikimedia.org
> > http://lists.wikimedia.org/mailman/listinfo/foundation-l
> >
>


-- 
-Brock


