Re: [Foundation-l] Release of squid log data

15 Sep 2007

On 9/14/07, Ilya Haykinson <haykinson(a)gmail.com> wrote:
...
  On 9/14/07, Tim Starling <tstarling(a)wikimedia.org> wrote:
  I wouldn't recommend using a hashed IP address to anyone involved in
 academic work. I've worked in the academic sector, I know how important
 it is for data to be above any criticism. Any data using unique IP
 addresses as an estimate of individual user population would be severely
 skewed by proxies and NAT. 
 Perhaps in order to prevent potentially violating our own privacy
 policy, we can meet the researchers half-way. 
The best way to avoid violating the privacy policy would be to change
it to say exactly what it is you plan on doing, and to not give data
from before the policy is changed.

...
  If we can find out the reason they need IP addresses we can craft the
 data we send them to satisfy their request.  For example:

 a) they could just need the unique addresses to link together browsing
 patterns, but not care for them to be IP addresses.  We could
 convert the addresses into a unique number (or a salted hash) and send
 them the data.
In case anyone's seriously considering this, make sure you've read
[[AOL search data scandal]] which should show you why it's completely
useless.  This is *especially* true with Wikipedia data, where the
urls we access constantly reveal who we are (e.g.
http://en.wikipedia.org/wiki/User_talk:Whatever).
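
As a rough illustration of option a), a salted (keyed) hash could look
something like the Python sketch below. The function name
`pseudonymize_ip`, the 16-character truncation, and the example
addresses are assumptions made for illustration, not anything proposed
in this thread. Note that this only hides the address itself; it does
nothing about the objection above that the requested URLs can identify
the user.

```python
import hashlib
import hmac
import secrets


def pseudonymize_ip(ip: str, salt: bytes) -> str:
    """Map an IP address to a stable pseudonym using HMAC-SHA256.

    The same (ip, salt) pair always yields the same pseudonym, so
    requests from one address can still be linked together.
    """
    digest = hmac.new(salt, ip.encode("ascii"), hashlib.sha256).hexdigest()
    # Truncated only to keep log lines short (an illustrative choice).
    return digest[:16]


# Hypothetical usage: one random salt per data release.
salt = secrets.token_bytes(32)
first = pseudonymize_ip("192.0.2.17", salt)
second = pseudonymize_ip("192.0.2.17", salt)
other = pseudonymize_ip("192.0.2.18", salt)
print(first == second)  # same address, same pseudonym
print(first == other)   # different address, different pseudonym
```

The salt would have to stay secret and be discarded after the export;
otherwise anyone could hash all 2^32 IPv4 addresses and reverse the
mapping.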

...
  b) they could be looking for network topology information; we could
 give them the first two or three octets of the IP address.
Three octets would be almost as bad as a) for the same reasons.  Two
octets would be better, but less useful too.
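
Option b) amounts to masking the low-order octets. A minimal sketch
using Python's standard `ipaddress` module, assuming IPv4 only;
`truncate_ipv4` is a hypothetical helper name and the address is a
documentation example:

```python
import ipaddress


def truncate_ipv4(ip: str, keep_octets: int) -> str:
    """Zero out all but the first `keep_octets` octets of an IPv4 address."""
    if keep_octets not in (1, 2, 3, 4):
        raise ValueError("keep_octets must be between 1 and 4")
    # strict=False accepts a host address and masks off the host bits.
    network = ipaddress.IPv4Network(f"{ip}/{8 * keep_octets}", strict=False)
    return str(network.network_address)


print(truncate_ipv4("192.0.2.17", 3))  # 192.0.2.0
print(truncate_ipv4("192.0.2.17", 2))  # 192.0.0.0
```

Keeping three octets leaves at most 256 addresses per bucket, while
keeping two leaves up to 65,536, which is the trade-off between
privacy and usefulness described above.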

...
  c) they could be looking for geographical distribution of queries; we
 could do the geo-lookup of addresses and give them coordinate
 resolution for each address instead of the address itself.
If that geo information is limited to country, I guess it wouldn't be too bad.
