[Foundation-l] Wikipedia tracks user behaviour via third party companies #2

Peter Gervai grinapo at gmail.com
Fri Jun 5 23:30:59 UTC 2009

Just a few sidenotes now.

2009/6/5 Mark (Markie) <newsmarkie at googlemail.com>:

> There are a few issues with this.  Devs have access to logs on WMF servers,
> not random external servers.

This is a good suggestion: basically you are saying that I should ask
the Foundation to provide me with a server inside WMF with developer
access. I don't mind that (as long as it has Debian installed).

This is a good (though a bit expensive) _temporary_ solution, since it
only serves huwp. It is not impossible to provide the service to other
projects, but definitely not to any wp larger than huwp, since the
current solution is a hack and does not scale. (And I could process
the squid logs, naturally, which is a better way to do it.)

The final solution would be either to modify awstats to handle this
better or to write custom code for it. I don't really have the time to
do either right now.

> The community cannot decide that Random_user1
> and Random_user2 etc will agree with the communities view on the stats being
> passed to an external server.

As you are aware, it's not really a random user, so what you write is
more rhetoric than fact. I dispute your statement, as I believe the
community can decide pretty much anything unless it violates some
higher-level policy, and it has been said that this predates the PP.
And I tend to disagree that it violates it, but that's an open debate.

>  Also there *may* be issues with the security
> of that server that means it could be compromised and could probably be
> accessed by the web hosting company if they so wished.

Sure, but I happen to be the web hosting company as well. You are
guessing instead of trying to get informed, as others around here do.
As I told you, the only person accessing the site is myself. And
security-wise, there is no such thing as 100% security, and it's quite
possible that Wikimedia servers tunnel all the data to the Chinese
secret service. You may trust me to know my job as well. :-)

> I still fail to see how, at this point (not before when there was no policy)
> this can be considered to be acceptable.  IP information etc is still being

Let me help.

> Release: Policy on Release of Data
> It is the policy of Wikimedia that personally identifiable data collected in
> the server logs, or through records in the database via the CheckUser
> feature, or through other non-publicly-available methods, may be released by
> Wikimedia volunteers or staff, in any of the following situations:

It is not the server log, it is not database records, and it is not
any other non-publicly-available method used by the staff. So the data
was not released to us by the staff. (And we did not steal it from
them either.) This complies with the policy.

Now, let's look at the volunteer part. We're volunteers, and some could
argue that we are using a non-publicly-available method (even if the
original intent was, in my opinion, clearly to cover methods used on
the WMF servers, and _not_ to cover this); in that case the policy
requires us (the volunteers) not to release the identifiable data. And
we comply, since we do not release any personally identifiable data.

Do you see now?

> Except as described above, Wikimedia policy does not permit distribution of
> personally identifiable information under any circumstances.

And it's great that you quoted that, since it shows nicely that we
comply here as well: we do not distribute p.i.i. under any
circumstances.

But we - as huwp - are not attached to this server, as I mentioned, and
I'd gladly put it up on WMF servers, even if this does not really mean
or change anything. But I find it unacceptable that anyone kills off
the stats, which have been running for many years now, without even trying
to look around. I see that it's pretty easy, since neither of you use
it, it's somebody else's problem. Try to see for a moment like it's

And since it has been okay for the past 5 years, I'd be glad if you
would continue the discussion WHILE reverting your changes. I don't
believe a few days would make a difference.

And another sidenote: if a newspaper makes a few false statements,
what is the correct course of action? Telling them [kindly] that
they're stupid, or interfering with your own projects and fellow
editors? And which is easier?

