[Foundation-l] Wikipedia tracks user behaviour via third party companies #2

Mark (Markie) newsmarkie at googlemail.com
Sat Jun 6 10:32:04 UTC 2009

On Sat, Jun 6, 2009 at 12:12 AM, Tisza Gergő <gtisza at gmail.com> wrote:

> Aryeh Gregor <Simetrical+wikilist at ...> writes:
> > I believe the major problems with the script are
> >
> > 1) It sent data to a server not directly controlled by the Wikimedia
> > Foundation.  No personally identifiable information should be sent in
> > bulk to any non-Wikimedia server.  Operation of any server hosting
> > significant amounts of sensitive information must be directly and
> > immediately accountable to Wikimedia's normal chain of command.
> I don't think that's reasonable. WikiMiniAtlas, for example, is hosted by
> WM-DE, thus every time it is used, IP data is sent to a non-WMF server.
> (Users have to click to load it, but it is linked from every page that has
> coordinates, so it can be considered bulk. And when it gets replaced with
> OSM, static map snippets will be loaded by default from a WM-DE-owned cache
> server, if I understand the setup correctly.)
> Of course, there should be *some* limit on what servers can receive data.
> As I said, the obvious choice for me would be to tie it to chapters (maybe
> it could even be included in the chapter agreement?). That, and maybe WMF
> staff should have root access for emergencies?

AFAIK, roots on the toolserver who have access to logs have to be signed off
by the WMF before they are allowed access.

> > 2) This use of data was not specifically authorized by the Wikimedia
> > Foundation, via either the Board or appropriate officers.  Peter may
> > be a checkuser, but that gives him authorization only to use checkuser
> > functions, not to collect or harvest other types of data.  As has been
> > noted, the data collected includes much more than checkusers can
> > access in the course of using their checkuser rights.
> Agreed. So consider this as a request for authorization :)

He would obviously need to ask for this himself.

> > Last I heard, Erik Zachte is working on improved statistics for all
> > Wikimedia projects.  These are running on Wikimedia servers and
> > specifically approved by Wikimedia.  It seems like the best course of
> > action would be for people to point out what they think is lacking in
> > his statistics, and perhaps offer to help improve them.
> Certainly, but that in itself is no reason not to have another system for
> the time being. It is not unheard of that development of new features gets
> delayed by a few years :) We have a working system in place; I don't think
> it should be removed just because there will be a better one at some
> indefinite point in time. It can be removed at that time just as well.
> As for statistics-related feature requests, I would have quite a few :)
> Unique visits/visitors, referrer data, country/browser/OS distribution (I
> seem to recall seeing something like this in Erik's stats, but I can't find
> it now), breakdown by action and by user group, search term statistics
> (without the wikistics.falsicon.de JS hack), gadget usage data. An API
> would also be nice (so that for example a user script can query the data
> for all internal links on the page, and show a colormap - it would be a
> nice tool for designing the layouts of portals).
> (It would be somewhat unfair to say Erik's stats are lacking these, since
> our stats can't measure most of them either. What I would miss most would be
> visitor counts and browser distribution. Also, I think stats.grok.se and
> wikistics.falsicon.de give slightly incorrect page view results because
> they don't take redirects and special pages into account.)
