[Foundation-l] Release of squid log data

Gregory Maxwell gmaxwell at gmail.com
Sat Sep 15 01:44:52 UTC 2007


On 9/14/07, Ilya Haykinson <haykinson at gmail.com> wrote:
> If we can find out the
> reason they need IP addresses we can craft the data we send them to
> satisfy their request.  For example:

Two years ago*, when we didn't actually have the data to release, I
proposed a two-pronged approach, restated here:

(1) Make as much of the non-private data public as we safely can. This
maximizes the public value of the data and avoids the harm of picking
favorites by sharing valuable data (commercially as well as
academically valuable) with only certain groups. Plus it scales much
better.

(2) Offer to run reasonable aggregation scripts for those who can
describe a need for access to data we protect. For example, if they
wanted to analyze article views vs country of origin the script could
look up the countries and only disclose that.
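
As a rough illustration of (2), here is a minimal sketch of such an
aggregation script for the views-vs-country case. Everything named in it
is an assumption made for the example: the (ip, article) record shape,
the toy lookup table (built on reserved documentation address ranges),
and the function names. A real script would parse the actual squid log
format and use a proper GeoIP database.

```python
import ipaddress
from collections import Counter

# Hypothetical lookup table for illustration only. These are reserved
# documentation ranges; a real script would query a GeoIP database.
TOY_GEO_TABLE = {
    ipaddress.ip_network("192.0.2.0/24"): "US",
    ipaddress.ip_network("198.51.100.0/24"): "DE",
}


def lookup_country(ip):
    """Map an IP address string to a country code, or "unknown"."""
    addr = ipaddress.ip_address(ip)
    for network, country in TOY_GEO_TABLE.items():
        if addr in network:
            return country
    return "unknown"


def views_by_country(records, lookup=lookup_country):
    """Count views per (article, country) pair.

    `records` is an iterable of (ip, article) pairs, an assumed shape.
    The IP is used only for the lookup and never appears in the result.
    """
    counts = Counter()
    for ip, article in records:
        counts[(article, lookup(ip))] += 1
    return dict(counts)


# Made-up sample records standing in for parsed log lines.
sample = [
    ("192.0.2.10", "Squid"),
    ("192.0.2.77", "Squid"),
    ("198.51.100.5", "Squid"),
    ("203.0.113.9", "Privacy"),
]

result = views_by_country(sample)
for (article, country), views in sorted(result.items()):
    print(article, country, views)
```

The point of the design is that only the aggregate leaves our hands: the
researcher gets per-country counts, and the addresses themselves stay
with us.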

If the needs of a researcher can't be met by data scrubbed with a
custom aggregator, then I must question the usefulness of their
research: If it's not possible to convert the research data into an
aggregate result which has no privacy problems then the underlying
data driving their research would be unpublishable, unrepeatable, and
unverifiable.

Keep in mind that well over 99% of the people potentially impacted by
this aren't our "community"; they aren't people who have already
agreed to lose a little privacy by making public edits... they are
just readers.

It is my understanding that public libraries do not generally disclose
detailed use records like this for outside research. Google and the
other search engines fought in court to avoid providing the US
government search log data.

I'm also disappointed with the standard of care shown by some other
academic researchers working with Wikipedia data in recent memory.

So long as there exist *reasonable alternatives* I'm having a hard
time seeing the justification for this proposed disclosure.



*For some reason our own archive of this thread seems to be missing. I
found a third-party copy:
http://www.archivum.info/wikipedia-l@wikimedia.org/2005-08/msg00049.html


