Re: [Foundation-l] Release of squid log data

15 Sep 2007

On 9/15/07, Ben McIlwain <cydeweys(a)gmail.com> wrote:
[snip]
...
> The AOL search data was
> NOT tagged with pseudonymous data (by which I'm assuming you mean
> usernames).  It was tagged with random numbers.  The way privacy was
> compromised in the AOL search data scandal had nothing to do with what
> the data was labeled as and everything to do with what the data was.
> One could look at all of the searches made by a given person and clue in
> on who they were - e.g. by looking for local subjects in their searches,
> see if they searched for anyone by name (maybe themselves or people they
> knew), see if they searched for any esoteric subjects, etc.

A unique random ID is a pseudonym.  The ability to tie multiple
searches to the same pseudonym was key.  While I could guess the
probable identity behind a single search in some cases without any
pseudonym, it is, as you pointed out, the ability to tie them together
which creates trouble.
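
To make that concrete, here is a rough sketch of how little work the
linking step takes once every record carries a stable random ID.  The
records, IDs, and the function name group_by_pseudonym are all made up
for illustration; none of this comes from the AOL data or from any
Wikimedia log.

```python
from collections import defaultdict

# Hypothetical log records: (random_id, query). The IDs and queries are
# invented for illustration and are not taken from any real data set.
records = [
    (4821, "plumbers in smalltown"),
    (9377, "weather tomorrow"),
    (4821, "jane example smalltown"),
    (4821, "rare orchid society"),
    (9377, "pizza recipes"),
]


def group_by_pseudonym(rows):
    """Collect every query issued under the same random ID."""
    profiles = defaultdict(list)
    for random_id, query in rows:
        profiles[random_id].append(query)
    return dict(profiles)


profiles = group_by_pseudonym(records)
```

No single query under ID 4821 says much, but the grouped list puts a
place name, a personal name, and an esoteric interest side by side,
which is the combination that narrows things down to one person.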

The point Tim was making was that the data Wikimedia has *previously
released* did not include any sort of identifier, pseudonymous or
not, and thus doesn't carry the same risks.

The data which is *proposed* to be disclosed would include IPs, which
act as either a pseudonymous identifier or an outright identifier.
I doubt Tim would disagree that there are significant privacy
implications in the case of those. Which is, of course, why he said
they were willing to enter into an NDA.
