[Foundation-l] Wikipedia tracks user behaviour via third party companies

Aryeh Gregor Simetrical+wikilist at gmail.com
Thu Jun 4 17:44:31 UTC 2009

On Thu, Jun 4, 2009 at 6:01 AM, Neil Harris<usenet at tonal.clara.co.uk> wrote:
> Surely this is something which should be possible to block at the
> MediaWiki level, by suppressing the generation of any HTML  that loads
> any indirect resources (scripts, iframes, images, etc.) whatsoever other
> than from a clearly defined whitelist of Wikimedia-Foundation-controlled
> domains?

Not possible as long as we allow JS to be added.  See [[halting problem]].

On Thu, Jun 4, 2009 at 6:20 AM, John at Darkstar<vacuum at jeb.no> wrote:
> User privacy on Wikipedia is close to a public hoax, pages are
> transferred unencrypted and with user names in clear text. Anyone with
> access to a public hub is able to intercept and identify users, in
> addition to _all_ websites that are referenced during an edit on
> Wikipedia through correlation of logs.

This only works for getting info on totally random Wikipedia users,
who happen to edit using your router.  This isn't a serious compromise
of privacy for practical purposes due to the resources required to get
info on a large number of users, or to target a specific user.  Users
who are concerned about this, however, can use secure.wikimedia.org.

Note that if you make edits, it should be pretty easy for a MITM to
figure out your IP address even if you're using SSL: 1) Watch all
traffic going to Wikimedia IP addresses.  2) Guess which traffic
streams correspond to edits by looking at the amount of data the
client is sending.  3) Correlate suspected edits with RecentChanges
over a period of time.  Once they know your IP address, if they're a
MITM, they can still figure out what sites you're accessing, just not
the exact pages (or exact domain in the case of virtual hosting).

So if you want real privacy against MITMs, you still need to use
something like Tor, as usual.
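
A rough sketch of step 3 above, assuming the observer has already
logged, per client IP, the times of large uploads to Wikimedia
addresses (steps 1 and 2).  All names and data here are made up for
illustration; a real attacker would also need to cope with clock skew
and unrelated traffic.

```javascript
// Hypothetical helper: count how many edits from RecentChanges have at
// least one large upload from this client within toleranceSec seconds.
function countMatches(uploadTimes, editTimes, toleranceSec) {
  return editTimes.filter(function (edit) {
    return uploadTimes.some(function (up) {
      return Math.abs(up - edit) <= toleranceSec;
    });
  }).length;
}

// Hypothetical helper: rank client IPs by how many edits they line up with.
function rankClients(uploadsByIp, editTimes, toleranceSec) {
  return Object.keys(uploadsByIp)
    .map(function (ip) {
      return {
        ip: ip,
        matches: countMatches(uploadsByIp[ip], editTimes, toleranceSec)
      };
    })
    .sort(function (a, b) { return b.matches - a.matches; });
}

// Made-up observations: seconds since the start of monitoring.
var uploadsByIp = {
  '192.0.2.10': [100, 460, 905, 1500],
  '192.0.2.20': [130, 2000],
  '192.0.2.30': [700]
};

// Made-up edit times for one account, as listed in RecentChanges.
var editTimes = [102, 463, 907, 1503];

var ranked = rankClients(uploadsByIp, editTimes, 5);
console.log(ranked[0]); // the client whose uploads best line up with the edits
```

The longer the observation period, the fewer clients will match every
edit by chance, which is why the correlation has to run over time.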

On Thu, Jun 4, 2009 at 12:53 PM, Robert Rohde<rarohde at gmail.com> wrote:
> One idea is the proposal to install the AbuseFilter in a global mode,
> i.e. rules loaded at Meta that apply everywhere.  If that were done
> (and there are some arguments about whether it is a good idea), then
> it could be used to block these types of URLs from being installed,
> even by admins.

No, it wouldn't.

document.write('<script' + ' src="' + 'http://www.go' + 'ogle-an' +
'alytics.com/urc' + 'hin.js" type="text/javascript"></script>');

Obviously more complicated obfuscation is possible.  JavaScript is
Turing-complete.  You can't reliably figure out whether it will output
a specific string.
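
To make that concrete, here is a minimal sketch of a substring-based
check.  The function name and blacklist are hypothetical and are not
how AbuseFilter itself is implemented; the point is only that a filter
sees the source text, not the string the script builds at run time.

```javascript
// Hypothetical filter: flags text that contains any blacklisted substring.
function naiveFilterBlocks(text, blacklist) {
  return blacklist.some(function (s) {
    return text.indexOf(s) !== -1;
  });
}

var blacklist = ['google-analytics.com'];

// What the filter sees: the source text of the obfuscated expression.
var urlSource = "'http://www.go' + 'ogle-an' + 'alytics.com/urc' + 'hin.js'";

// What the browser ends up with after evaluating that expression.
var url = 'http://www.go' + 'ogle-an' + 'alytics.com/urc' + 'hin.js';

console.log(naiveFilterBlocks(urlSource, blacklist)); // false: source slips through
console.log(naiveFilterBlocks(url, blacklist));       // true: run-time value would match
```

A smarter filter could fold constant string concatenations, but the
script could then compute the URL some other way, and so on.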

However, perhaps a default AbuseFilter could be installed telling
admins that installing Analytics is a violation of Foundation policy
and that they'll get desysopped if they continue.  That wouldn't stop
them from doing it if they were determined, but it could trigger an
alert so that the appropriate parties can check whether they tried to
evade it.  Maybe the filter could be installed on Meta
and local violations could go to Meta logs so stewards will see?  Are
global filters possible right now?

At a bare minimum, such a warning would reduce inadvertent errors.
