[Foundation-l] Release of squid log data

Gregory Maxwell gmaxwell at gmail.com
Mon Sep 17 12:54:42 UTC 2007


On 9/17/07, Andrew Gray <shimgray at gmail.com> wrote:
> Spain has (multiple) laws which make unauthorised or negligent release
> of personally identifiable data a criminal offence, and a government
> which generally respects them; Iran and China do not. Invoking the
> Evil Totalitarian Bogeyman isn't really helpful, here; there's no
> reason we have to treat all applicants the same way.

Great, so we're stuck picking and choosing universities... giving more
access to some people than others... It's just ugly and should be
avoided as much as possible.

> So you make data storage security part of the agreement. Frankly, I
> see absolutely no reason we should trust a competent, serious,
> university research department any *less* than we trust WMF's ability
> to keep tabs on who's got hold of the data...

It shouldn't be a question of trusting anyone less; it's a question of
exposed surface. WMF has access to the data regardless; allowing a
larger pool to have access to the data increases exposure.

And for what purpose? The data any valid research will need is data
that can be supplied through some kind of anonymization or aggregation.
If it can't be, then the researcher couldn't publish at all: if not
anonymous or aggregated data, what would they publish?

Yes, it's nicer to just have access to the data... especially when
you're not starting with a concrete hypothesis but instead just want
to move the random numbers around and see what hypothesis falls
out... But dealing with the privacy implications is a reality of this
kind of work.

And frankly, if it turns out to be harder to have WMF aggregate the
data in a way that's useful for your research than it is to protect the
data yourself, something is wrong.


