[Foundation-l] Release of squid log data

Ben McIlwain cydeweys at gmail.com
Sat Sep 15 16:43:52 UTC 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Tim Starling wrote:
> Brian wrote:
>> Although having this data is a wet dream of mine, I find it unconscionable
>> to release it, and I feel that whoever was responsible for releasing it has
>> already overstepped their bounds. We already know from the New York Times
>> analyzing AOL's search logs that persons can be identified from search logs,
>> and we know from Microsoft's Non-Disclosure Agreements with universities
>> around the world for portions of the Windows 2000 source code that these
>> NDAs, even to universities, are not effective in stopping the data from
>> being leaked.
> 
> The data that has been released cannot be used to identify individuals.
> The AOL search data could be used to identify individuals, because
> searches were tagged with a pseudonymous identifier. There are no such
> identifiers in the data we are sending out.

I'm going to assume good faith here and assume that you simply
don't know what the AOL search data was about.  The AOL search data was
NOT tagged with pseudonymous data (by which I'm assuming you mean
usernames).  It was tagged with random numbers.  The way privacy was
compromised in the AOL search data scandal had nothing to do with what
the data was labeled as and everything to do with what the data was.
One could look at all of the searches made by a given person and clue in
on who they were - e.g. by looking for local subjects in their searches,
seeing whether they searched for anyone by name (maybe themselves or
people they knew), seeing whether they searched for any esoteric
subjects, etc.

You would do well to educate yourself on what the AOL search data
scandal actually was, because it seems like we may already be making the
same mistakes without you realizing it.

> For example, a search for a social security number, by itself, tells you 
> nothing about the individual who made it. Was it the owner of the SSN, 
> an employer, or someone going through the man's rubbish? Or was it a 
> Wikipedian trying to determine if someone's SSN is notable enough to 
> include in an article?

Actually, a search for a social security number tells you pretty much
everything you need to know and leads directly to infringement of
privacy.  Many people unmasked in the AOL search data scandal had been
searching for personally identifiable information.

> In the unlikely event that someone types their life story into the 
> search box and clicks "go", you still don't know who wrote it, whether 
> it was autobiographical, slander or fantasy.

You're being unrealistic here.  You're assuming that the person doing
the investigating is a complete moron and isn't able to put two and two
together.  That simply isn't true.  In the AOL search data scandal,
reporters were able to discover many real life identities using
information that was far, far less substantial than a complete life
story.  Something as simple as a few keyword searches for obscure
hobbies and location-specific searches was enough to track some people
down.  After all, how many Yorkshire terrier enthusiasts do you think
you're going to find in the average Small Town, USA?
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (MingW32)

iD8DBQFG7AvIvCEYTv+mBWcRAoR/AJkBt++a+Rv4iaFUiY2QbcS1vQS2BwCeOr4J
+Kx4eZKwrT+2GaI2eofTD3I=
=YHEV
-----END PGP SIGNATURE-----


