[Foundation-l] Wikipedia tracks user behaviour via third party companies #2

Aryeh Gregor Simetrical+wikilist at gmail.com
Fri Jun 5 21:41:42 UTC 2009

On Fri, Jun 5, 2009 at 5:22 PM, Michael Snow <wikipedia at verizon.net> wrote:
> As I understand it, nobody is arguing that it's considered acceptable at
> this point.

Peter Gervai seemed to argue exactly that, unless I badly misread him:

> someone from outside seriously interfere with other project
> based on, as it turns out, incorrect informations. . . .
> . . . I do not believe the case actually breached any privacy . . .

And so did Tisza Gergő:

> More importantly, the privacy policy explicitly states that developers might
> have access to the raw logs. The stat is thus in compliance with the letter of
> the privacy policy, and I don't see why it would be contrary of its spirit.

The privacy policy clearly prohibits "release" of data to outside
sources for the purpose of statistical analysis, since that doesn't
fall within the six enumerated points under "Release: Policy on
Release of Data".  I suppose it's arguable by the letter of the policy
that sending the data to a server which only a single Wikipedian has
access to isn't "release".  However, I think it's clear that the
intent of the policy was otherwise, and Domas acted in accordance with
established policy and with full understanding of the nature of the
script he was removing.

It might be worth defining "release" more clearly to avoid any
confusion in the future.  Would it have been any different if it were
being sent to the toolserver instead of a totally third-party server,
for instance?  I'd think not, but it's not fully clear from reading
the policy.  How about a checkuser downloading some data to his
computer for analysis beyond that permitted by the web-based
interface?  Why is that not release if downloading it to a server is?
Does that depend on the amount, intent, or some other factor?  (Or is
it release?  If so, why is it different from downloading web pages so
you can view them in your browser?)

Also, the policy states in several places, vaguely and redundantly,
that logs will not be publicized, each time in different words:
"is not made public", "will not be published", "is not reproduced
publicly".  In general, there's a lot of repetition that makes the
policy hard to draw firm conclusions from.  If you just saw those
mentions, you might think it was fine to reproduce the logs as long
as the copy wasn't actually *public*.  It could use more precise and
condensed wording.
