On 23/08/07, Armed Blowfish <diodontida.armata(a)googlemail.com> wrote:
On 17/08/07, madman bum and angel
<madman(a)ferretproductions.com> wrote:
Luna wrote:
For clarification, does this mean that discussion regarding AB in
particular, or discussion on proxies in general? I'm of the opinion that
the former issue was distracting from productive resolution of the greater
matter at hand: trying to find a better working solution to the proxies
problem. I'd say more, but want to wait for that clarification.
Thanks for re-focusing us. I'm working on a solution myself; I hope to
be able to post a proposal for review within a couple days.
-madman bum and angel
I tried to 're-focus' this a long time ago.
To quote myself, 'I asked for my RfA to be blanked for a reason.
Really, if you want to
argue for or against allowing Tor editors, that's fine, but could my
RfA please be left out of it?' (13 August 2007 10:20)
'To clarify on that, consider good Tor users and exit node operators
who have never contributed to Wikipaedia. They cannot be said to have
violated policy, since they have obeyed it by not editing, either when
most of Tor was softblocked, or by evading Tor blocks while most of
Tor has been hardblocked. (Well, unless you want to say that exit
node operators allowing exits to Wikipaedia are 'violating policy' by
doing so... why some people think Tor exit policies are in
Wikipaedia's jurisdiction, I don't know....)
It would be nice if those Tor users and exit node operators could
edit, after being authenticated as trusted. On the Tor IRC channel,
Wikipaedia is complained about more than any other site, by polite
individuals. However, I myself have no interest in getting
unbanned/unblocked/whatever, and said RfA is a source of distress for
me, so it would be nice if you could leave that out of the debate.'
(13 August 2007 10:40)
But apparently no one reads what I write.
Actually, in case you are interested, I tried to suggest things which
would be helpful for other Tor users but not for me a long long time
ago.
This was on my talk page before I asked it to go poof. If you are
interested, I can add some information about honesty and deception in
the animal world, and how that relates to Wikipaedia.
Sorry for the formatting, paste it into Wikipaedia and hit preview to
look at it, I guess.
------------------------------------------------------------------------------------------------
== Purpose of blocking Tor ==
''Tor is blocked for economic reasons, because the high quantity of
destructive edits that could come through it aren't worth putting up
with.''
The main purpose of blocking [[Tor (anonymity network)|Tor]] is to
reduce the quantity of destructive edits to Wikipaedia. Certainly,
Wikipaedia gets destructive edits from other places as well, but open
proxies seem to be used for that sort of thing more often than other
IP addresses. The complex Sybil attacks on Wikipaedia are fairly
unique, but Wikipaedia is in a better position to defend against them
than many other systems. However, blocking Tor does result in
collateral damage of editors who mean well, and there are other ways
of achieving the purpose of protecting Wikipaedia, which is what the
rest of this essay examines.
Some quotes:
: 'One of the things we currently do is block Tor. I consider that a
reasonable solution to the vandalism problem, but an unfortunate
thing, since to my mind, Tor is something very good.
: 'It would be nice if we could look at the edits coming from Tor and
say "Oh, these are fine, they are mostly responsible edits." It'd
even be ok if we could look at the edits coming from Tor and say "Ok,
so there's a touch more vandalism from these than from other ip pools,
but there's also some good stuff coming through from places where we
normally don't see a lot of editing activity. We'll put up with it."
: 'As it is now, we look at it and say "oh, jesus".'
: — Jimbo Wales <ref name="jimbo1">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00353.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-18
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-29]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
: 'Hey folks -- the reason that Wikipedia (and other services) use IPs
to block users is not stupidity, laziness, or ignorance. People use
IP-based blocking because it limits abuse better than no blocking at
all. Blocking IPs is not saying, "I hate privacy, I think IPs do and
should map 1:1 to human beings, and abuse is an ISP problem; and Tor
doesn't exist." It's saying, "I can't deal with the abuse I'd see if
I didn't block some IPs, and while IP blocking is imperfect, it's
about as good as any other scheme I have had the time so far to
implement."' — Nick Mathewson, Tor developer <ref name="nickm1">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00260.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-18
|last= Mathewson
|first= Nick
|date= [[2005-09-27]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
== The mixed nature of Tor, and disadvantages of blocking it completely ==
''Both the well-meaning and the ill-intentioned use Tor - blocking Tor
completely prevents not only bad but also good.''
Both helpful and harmful people, as well as some people who are not
clearly helpful or harmful, use Tor. In fact, most of the ''really
bad'' people don't use Tor. Tor does not provide the best anonymity,
but unlike many of the methods criminals use, it does provide legal
and ethical anonymity, since the Tor nodes are all volunteers.
However, Tor still gets more than its share of (mostly) law-abiding
trouble-makers.<ref name="faq-abuse">{{cite web
|url=
http://tor.eff.org/faq-abuse.html.en#WhatAboutCriminals
|title= Abuse FAQ for Tor Server Operators: Doesn't Tor enable
criminals to do bad things?
|accessdate= 2007-06-18
|date= [[2007-06-17]]
|publisher= The Tor Project
|quote=
}}</ref>
But Tor also has well meaning users. People who use Tor to bypass
censorship (consider the [[Wikipedia:WikiProject Countering systemic
bias|systemic bias]] implications),<ref
name="entranceblockingresistance"> {{cite web
|url=
http://cvs.seul.org/viewcvs/viewcvs.cgi/*checkout*/tor/trunk/doc/design-pap…
|title= Design of a blocking-resistant anonymity system (Draft - revision 10168)
|author = Roger Dingledine and Nick Mathewson
|accessdate= 2007-06-18
|date= [[2007-05-12]]
|publisher= The Tor Project
|quote=
}} </ref> people who use Tor to get around restrictive firewalls,<ref
name="firewalledclient"> {{cite web
|url=
http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#FirewalledClient
|title= My firewall only allows a few outgoing ports
|work= TheOnionRouter/TorFAQ
|accessdate= 2007-06-18
|date= [[2007-06-11]]
|publisher= Noreply Wiki
|quote=
}}</ref> and other random and well-meaning people who want or need
privacy.<ref name="toroverview">{{cite web
|url=
http://tor.eff.org/overview.html.en
|title= Tor: Overview
|accessdate= 2007-06-18
|date= [[2006-08-22]]
|publisher= The Tor Project
|quote=
}}</ref>
Ideally, we should like to separate the well-meaning users from the bad.
Some more quotes:
: 'Let me tell you what I love. I love the Chinese dissident who
wants to work on Wikipedia articles in safety. I love that Wikipedia
is an open platform that allows people to have that voice, and that we
can have a positive impact on the world in large part because we don't
bow to censorship and we are willing to reach out and work with people
like Tor to empower individuals to speak, no matter what sort of
oppressive conditions they face.
: 'WE ARE ON THE SAME SIDE.'
: — Jimbo Wales <ref name="jimbo2">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00253.html
|title= Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-18
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-27]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
: 'I want there to be a great method for people to be able to edit
Wikipedia safely and securely no matter what their personal situation
may be, and I want that method to be sufficiently abuse-free that we
can allow it.' — Jimbo Wales <ref name="jimbo3">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00315.html
|title= Re: [roy(a)rant-central.com: Re: [arma(a)mit.edu: Re: Wikipedia & Tor]]
|accessdate= 2007-06-18
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-28]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
== Privacy with minimal damage to Wikipaedia? ==
Nick Mathewson, a Tor developer, summarised many of the key points
quite nicely during a discussion between Jimbo and the good people of
Tor in 2005:
: 'To everybody in this discussion: here are some things that might
make you feel better in the short run, but which will ultimately not
help.
:: - Trying to convince Jimbo that privacy is good, or that reducing
false positives would serve wikipedia's goal of openness. He knows.
:: - Trying to convince Tor developers that abuse is bad, or that
reducing abuse would server Tor's goal of widespread acceptance. We
know.
:: - Trying to convince Tor developers to subvert services attempts to
block Tor exit connections{4} based on IPs. We won't; it would be
wrong.
:: - Trying to convince Wikipedia operators that privacy is evil _per
se_, and should be thwarted regardless of potential for abuse in
particular instances. I doubt they'd buy it; it doesn't look like
Jimbo will.
:: - Trying to convince the world that some wikipedians have an
unnuanced view of Tor and anonymity and false-positives. We know.
:: - Trying to convince the world that some Tor operators and users
have an unnuanced view of IP blocking and abuse. We know.
: 'Here are some things that would be harder, but which would probably
be useful:
:: - Try to develop better understanding of why and whether abuse
prevention mechanism, even the ones you think are crappy, work in
practice.
:: - If a hypothetical abuse prevention mechanism wouldn't work,
explain why not.
:: - When it looks like somebody is saying something utterly stupid or
insane, try to figure out why, from their point of view, it might seem
reasonable to say such a thing.{5}
:: - Come up with workable ways to prevent abuse that don't damage
privacy or preclude anonymizing layers like Tor.
:: - Implement those models, and try them out.
: 'There are probably other helpful things, too.'
: — Nick Mathewson <ref name="nickm1" />
== Internet Protocol (IP) addresses do not map 1:1 to human beings, or even computers ==
''IP addresses are not people - they are changed and shared in various
ways, even without open anonymising proxies.''
Consider the following situations:
* Shared computers - [[public computer]]s such as those in libraries
and internet cafes, as well as more private computers shared by people
who are living together, such as family members. Note that many of
these may be behind a NAT (see later) anyway.
* [[Dynamic Host Configuration Protocol|Dynamic IP addresses]] - IPv4
addresses are limited in quantity, and IPv6 is not catching on. So,
rather than needing an IP address for each separate client (where a
client may be a computer or a NAT, as explained later), an Internet
Service Provider (ISP) only needs enough IP addresses for the number
of clients that are online at one time. The automated nature of
Dynamic Host Configuration Protocol (DHCP) also makes it convenient.
Note that there are different degrees of dynamicness - an IP address
is assigned for a limited but adjustable period of time, and a DHCP
server may or may not try to give a client the same IP address as it
had before (note that there are ways of getting the DHCP server to
give you a new IP address).<ref name="sun">{{cite web
|url=
http://www.sun.com/software/whitepapers/wp-dhcp/dhcp-wp.pdf
|title= Dynamic Host Configuration Protocol: Technical White Paper
|accessdate= 2007-06-18
|year= 2000
|month= August
|format= [[PDF]]
|publisher= Sun Microsystems, Inc
|quote=
}} Also see Google's
[
http://216.239.51.104/search?q=cache:G4oLBWItHb8J:www.sun.com/software/whit…
HTML cache].</ref>
* Overloading or overlapping [[network address translation]] (NAT) -
This is another solution to the IPv4 address scarcity problem.
Basically, multiple clients share the same IP address, with a router
converting packets sent to that IP address to packets sent to various
local IP addresses, and vice versa. This can be done at multiple
levels - a household, a school, library, or business, or even an
entire ISP. (Consider the collateral damage that could be caused by
blocking a Tor exit node which is running behind a large NAT.<ref>{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00257.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-18
|last= Syverson
|first= Paul
|date= [[2005-09-27]]
|publisher= or-talk Tor mailing list
|quote= This is potentially a bigger problem than it may appear. On
the one hand, services should be allowed to refuse connections from
sources of possible abuse. But when a Tor node administrator decides
whether he prefers to be able to post to Wikipedia from his IP
address, or to allow people to read Wikipedia anonymously through his
Tor node, he is making the decision for others as well. (For a while,
Wikipedia blocked all posting from all Tor nodes based on IP
addresses.) If the Tor node shares an address with a campus or
corporate NAT, then the decision can prevent the entire population
from posting. This is a loss for both Tor and Wikipedia: we don't
want to compete for (or divvy up) the NAT-protected entities of the
world.
}}</ref>)<ref name="cisco">{{cite web
|url=
http://www.cisco.com/en/US/tech/tk648/tk361/technologies_tech_note09186a008…
|title= How NAT Works
|accessdate= 2007-06-18
|date= [[2006-01-24]]
|work= IP Addressing Services
|publisher= [[Cisco]]
|quote=
}}</ref>
* [[AOL]] - Think of it like an overloading or overlapping NAT where
the public IP addresses keep rotating (dynamic NAT).<ref
name="aol">{{cite web
|url=
http://webmaster.info.aol.com/proxyinfo.html
|title= AOL Proxy Info
|accessdate= 2007-06-18
|date= [[2005-06-14]]
|publisher= America Online, Inc
|quote=
}}</ref> A good solution to the AOL problem is X-Forwarded-For, but
this only works because AOL is not actually trying to protect the
privacy of its users. See Meta's [[m:XFF project|XFF project]].
* Things change. People switch ISPs.<ref name="eweek">{{cite web
|url=
http://www.eweek.com/article2/0,1759,290497,00.asp
|title= Need For Speed Drives Customer Churn
|accessdate= 2007-06-20
|last= Wetzel
|first= Rebecca
|date= [[2001-09-07]]
|publisher= eWeek
|quote= About 24 percent of respondents say they switched ISPs in the
past year, compared to about 23 percent in 2000 and 20 percent the
year before. About 28 percent say they are extremely or somewhat
likely to switch next year.
}}</ref> People move.
Unless you are paying for a server-quality internet connection,
chances are your IP address is dynamic, you are behind a NAT, or quite
possibly both.<ref name="cisco" />
Note that IP addresses [[Internet_privacy#ISPs|can generally be traced
back to a particular computer or home/office router, and ISP bill]],
but this tends to require either the cooperation of the ISP, or
cracking into the ISP, which are not things Wikipaedia does.<ref
name="adrian">[[User talk:Adrian|Ask]] [[Adrian Lamo]].</ref>
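
The sharing described in this section can be sketched in a few lines. This is only an illustration: the editor names and the addresses (taken from the reserved documentation ranges) are made up, and `collateral_damage` is a hypothetical helper, not anything Wikipaedia or Tor actually runs. It shows that a website only sees the public address, so blocking a Tor exit node behind a NAT also blocks everyone else behind that NAT.

```python
from collections import defaultdict

# Hypothetical editors and the public IP address a website would see for each.
# 203.0.113.7 stands in for a campus NAT; all addresses are documentation examples.
observed = [
    ("alice", "203.0.113.7"),
    ("bob", "203.0.113.7"),
    ("tor_exit_on_campus", "203.0.113.7"),
    ("carol", "198.51.100.20"),
]

def editors_by_ip(pairs):
    """Group editors by the public IP address they appear to come from."""
    groups = defaultdict(set)
    for editor, ip in pairs:
        groups[ip].add(editor)
    return groups

def collateral_damage(pairs, blocked_ip, target):
    """Return everyone other than the target who is also hit by blocking blocked_ip."""
    return editors_by_ip(pairs)[blocked_ip] - {target}

affected = collateral_damage(observed, "203.0.113.7", "tor_exit_on_campus")
print(sorted(affected))
```

Carol, who is on a different address, is unaffected; alice and bob are blocked even though they never touched Tor.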
== IP-based blocking, and why it works better than nothing ==
''While IP-based blocking is far from perfect, and could not actually
prevent someone from editing, even without open anonymising proxies, it
is economic - at least you force the person to find another IP address
or range before he or she can harm Wikipaedia again.''
As we have established
[[#Internet_Protocol_.28IP.29_addresses_do_not_map_1:1_to_human_beings.2C_or_even_computers
|above]], IPs do not correlate exactly to individual people. However,
ISP-issued IPs tend to be a costly resource. Yes, AOL IPs rotate
frequently, but you actually have to purchase an AOL connection. You
might be in a NATed library, but using that IP requires your physical
presence in the library. Your entire ISP may share one IP address, but
again, you have to purchase a connection from that ISP. If you have a
dynamic IP address not shared by that many people, perhaps just the
people you live with, which tends to stay the same upon renewal,
IP-based blocking works even better. However, there will always be
collateral damage, and there will always be ways to bypass such
blocks.
As Nick Mathewson said, 'People don't block IPs because they think IPs
are people, or because they've never heard of NAT. They block IPs
because IPv4 addresses are (for most people, at the moment, to a first
approximation) a somewhat costly{1} resource. When they block Bob's
IP, the theory is that they force him spend the effort to move to a
new IP before he can abuse their service again.{2}'<ref name="nickm1"
/>
Ultimately, the goal is to wear the undesired user out, and get them
to exercise their right to leave.<ref>{{cite web
|url=
http://www.usemod.com/cgi-bin/mb.pl?CommunityExile
|title= CommunityExile
|accessdate= 2007-06-18
|publisher= Meatball Wiki
|quote=
}}</ref>
The problem with open proxies is that they are not a costly enough
resource - anyone in the world with an internet connection can use
Tor.
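
The economic argument of this section can be put as a toy calculation. The effort figures below are invented purely for illustration (they are assumptions, not measurements), and `is_worn_out` is a hypothetical name. The only point is the comparison: each block forces the person to pay for a fresh address, and that price is far lower for open proxies than for ISP-issued addresses.

```python
# Illustrative, made-up effort units needed to obtain one fresh address.
COST_PER_NEW_ADDRESS = {
    "isp_account": 50.0,    # buy a new connection
    "library_visit": 10.0,  # physically go to another NATed location
    "open_proxy": 0.1,      # pick another proxy or exit node
}

def total_effort(address_kind, times_blocked):
    """Effort spent replacing addresses after being blocked times_blocked times."""
    return COST_PER_NEW_ADDRESS[address_kind] * times_blocked

def is_worn_out(address_kind, times_blocked, patience):
    """True if the accumulated effort exceeds what the person will put up with."""
    return total_effort(address_kind, times_blocked) > patience

for kind in COST_PER_NEW_ADDRESS:
    print(kind, is_worn_out(kind, times_blocked=5, patience=20.0))
```

With these assumed numbers, five blocks wear out someone relying on ISP accounts or library visits, but not someone hopping between open proxies.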
== Destructive-editing-resistant Tor unblocking ==
'''Editor's note: Haven't actually finished what follows. It's rather
outliney, sorry.'''
=== Softblock Tor IPs ===
''Softblocking, as presently implemented, does not work because it is
too easy to create accounts.''
In theory, by restricting Tor users to editing from an account rather
than simply editing directly as Tor IP, we can then block the
destructive ones individually. If account creation is uniformly
disabled for open proxies, then to get an account they would have to
go through the effort of emailing unblock-en-l.<ref
name="jimbosoftblock"><!-- This message just shows Jimbo considering
the idea of softblocking, not the technical theory I gave. -->{{cite
web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00297.html
|title= Re: Wikipedia & Tor
|accessdate= 2007-06-18
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-27]]
|publisher= or-talk Tor mailing list
|quote= Putting Tor users into a "soft block" mode is a reasonable
thing to do, but I'll have to think about how we might want to do it.
}}</ref> In practice, account creation is too cheap. This didn't
work.<ref name="thatcher131">{{cite web
|url= {{fullurl:Wikipedia_talk:Blocking_policy|diff=next&oldid=129596558}}
|title= Wikipedia talk:Blocking policy (diff)
|accessdate= 2007-06-21
|author= Thatcher131
|authorlink= User:Thatcher131
|date= 19:03, [[2007-05-09]]
|publisher= Wikipaedia, The Free Encyclopaedia
|quote= TOR proxies were always blocked. When softblocking became
possible, it was tried as an experiment. The experiment failed. The
checkusers have repeatedly discovered serious vandalism coming from
softblocked anonymous proxies (not just TOR).
}}</ref><ref name="jayjg">{{cite web
|url= {{fullurl:User_talk:Jayjg|diff=129415904&oldid=129415589}}
|title= User talk:Jayjg (diff)
|accessdate= 2007-06-21
|author= Jayjg
|authorlink= User:Jayjg
|date= 02:33, [[2007-05-09]]
|publisher= Wikipaedia, The Free Encyclopaedia
|quote= I'm not talking about Main page vandalism, or password
cracking, which are recent and ephemeral problems. I'm talking about
run of the mill nastiness. There's lots of it out there, and people
are getting away with it using TOR proxies. That has to stop.
}}</ref> Also see [[Wikipedia:Blocking policy proposal]].
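
A minimal sketch of the softblock/hardblock distinction, to make the weakness concrete. The class and function names here are hypothetical and this is not MediaWiki's code; it only models the behaviour described above: a softblock stops logged-out editing from the address, a hardblock stops logged-in editing as well, and account creation is a separate switch.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class IPBlock:
    anon_only: bool                 # True models a softblock, False a hardblock
    account_creation_blocked: bool  # separate switch for creating accounts

def may_edit(block: Optional[IPBlock], logged_in: bool) -> bool:
    """Whether an edit from an address with this block (or None) goes through."""
    if block is None:
        return True
    if block.anon_only:
        return logged_in  # softblock: only logged-out editing is stopped
    return False          # hardblock: logged-in editing is stopped too

def may_create_account(block: Optional[IPBlock]) -> bool:
    """Whether a new account can be registered from the address."""
    return block is None or not block.account_creation_blocked

softblock = IPBlock(anon_only=True, account_creation_blocked=False)
hardblock = IPBlock(anon_only=False, account_creation_blocked=True)

# Under a softblock with account creation enabled, one cheap step restores editing.
print(may_create_account(softblock) and may_edit(softblock, logged_in=True))
```

The last line is the problem in miniature: as long as creating an account costs next to nothing, the softblock only adds one trivial step for a destructive editor.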
=== Masking Tor exit nodes ===
''Collecting Tor users behind a single IP address or mask would make
it easier to play temporary blocking games, and to check up on edits
coming through Tor.''
It is rather hard to change the blocking options for Tor exit nodes
all at once, since the IPs are all over the place. If Tor exit nodes
were hardblocked but a separate service, e.g. a hidden service, were
set up which Tor users could connect to, that separate service could
use its own IP, giving Wikipaedia one IP to use to change the
blocking options for Tor users. This would allow Wikipaedia to play
games with temporary blocks during periods of widespread Tor misuse,
and give Checkusers a single IP to check. It would also help give
well-meaning Tor users the opportunity to establish a reputation.
Freenode already does this, using two hidden services - one which any
Tor user may connect to but is blocked during periods of general Tor
misuse, and one which requires e-mailing the freenode staff with their
public key, etc.<ref name="freenode">{{cite web
|url=
http://freenode.net/irc_servers.shtml#tor
|title= Accessing Freenode Via Tor
|accessdate= 2007-06-21
|year= (c) 2002-2007
|work= IRC Servers
|publisher= Peer-Directed Projects Center
|quote=
}}</ref>
[[#An_old_patch |A patch]] written by Adam Langley, but never
committed, would accomplish something similar.
Note that this might not do much in terms of sockpuppetry
prevention/detection (discussed later).
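
The administrative difference can be sketched as follows. Everything here is assumed for illustration: the addresses come from the reserved documentation ranges, the block table is a plain dictionary, and `apply_policy` is a hypothetical helper. The sketch only shows why one gateway address is easier to manage than many scattered exit nodes.

```python
def apply_policy(block_table, addresses, policy):
    """Set the same blocking policy on every address; return entries touched."""
    for address in addresses:
        block_table[address] = policy
    return len(addresses)

# Hypothetical exit nodes scattered across unrelated ranges, plus one gateway.
exit_nodes = ["192.0.2.10", "198.51.100.77", "203.0.113.5"]
gateway = "192.0.2.200"

# Without masking: every policy change has to touch every exit node.
scattered = {}
touched_scattered = apply_policy(scattered, exit_nodes, "hardblock")

# With masking: exit nodes stay hardblocked, and only the gateway entry changes.
masked = {}
apply_policy(masked, exit_nodes, "hardblock")
touched_masked = apply_policy(masked, [gateway], "softblock")
apply_policy(masked, [gateway], "hardblock")  # e.g. during a wave of misuse

print(touched_scattered, touched_masked)
```

A real exit node list is much longer than three entries and changes constantly, which is what makes the single gateway entry attractive.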
=== Tor pseudonymity system ===
''Building an authentication system into Tor would break Tor's
security.''
It has been suggested that some form of authentication be implemented
in Tor, in order to make it more usable. However, Tor is not a
pseudonymity network. Tor is an anonymity network. Pseudonymity
would basically be the equivalent of adding a backdoor (i.e. intentional
vulnerability) to Tor.
: 'Actually, we're not in any better position than you are. We don't
know who our userbase is either; we certainly don't have identities
for them, and we really don't want to track their identities or
trustworthiness, for a number of reasons:
:: - If it were easy for us to tell what individual users were doing
with Tor, it would be easy for *everybody* tell what individual users
were doing. I wish we could separate good users from bad without
seeing what they were doing, but without linking them to the actual
contents of their communication, it isn't really possible.
:: - We don't want people to have to trust us with their secrets. It
would make us a great target for malicious hackers and legal attacks.
:: - Our standard of trust is not likely to be anyone else's.
:: - We are not a community service; our operators don't know our
users, and our users don't know each other, except when they choose to
communicate on forums like this one. This is necessary for privacy:
if the community knows who's who on the network, so does the Chinese
government.'
: — Nick Mathewson <ref name="nickm2">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00299.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-20
|last= Mathewson
|first= Nick
|date= [[2005-09-27]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
=== Extra-Tor pseudonymity system ===
''An authentication layer on top of Tor could be voluntary, but basing
it on IPs would leave many unsolved problems.''
The pseudonym system could be external to Tor, making it optional for
those who would prefer to be completely anonymous. This might be a
separate authentication server, or Wikipaedia's own authentication
system.
The good people of Tor [[User:Lunkwill/nym|pro]][[User
talk:Lunkwill/nym|posed]] a piece of software called 'Nym' for this.
Nym basically provides a thin wrapper around an IP address not
believed to be a Tor exit node or other blocked IP address, so it
would not be helpful to users who are truly interested in privacy or
who run exit nodes. It would, however, be helpful to users who use
Tor for reasons other than privacy, as well as non-Tor users who spend
much but not all of their time behind some shared, frequently blocked
IP address, such as a school.<ref name="nym-paper">{{cite web
|url=
http://lunkwill.org/cv/nym.pdf
|title= nym: practical pseudonymity for anonymous networks
|accessdate= 2007-06-28
|author= [
http://lunkwill.org/cv/ Holt, Jason E.]
|month= October
|year= 2005
|quote=
}} </ref><ref name="nym-ortalk-wrote">{{cite web
|url=
http://archives.seul.org/or/talk/Oct-2005/msg00229.html
|title= Re: Wikipedia and Tor - a solution in the works?
|accessdate= 2007-06-28
|last= Holt
|first= Jason
|date= [[2005-10-29]]
|publisher= or-talk Tor mailing list
|quote=
}} </ref><ref name="nym-bugzilla">{{cite web
|url=
http://bugzilla.wikimedia.org/show_bug.cgi?id=3729
|title= Bug 3729 - Patch: SSL client certificate authentication
|accessdate= 2007-06-30
|author= Jason, Evaldo Gardenali, Timo Jyrinki
|publisher= Wikimedia Bugzilla
|quote=
}} </ref><ref name="nym-ortalk-vote">{{cite web
|url=
http://archives.seul.org/or/talk/Dec-2005/msg00003.html
|title= Voting for nym
|accessdate= 2007-06-28
|last= Holt
|first= Jason
|date= [[2005-12-02]]
|publisher= or-talk Tor mailing list
|quote=
}} </ref><ref name="nym-client">{{cite web
|url=
http://lunkwill.org/src/nym/javascript/jsnymclient.html
|title= Javascript nym interface
|accessdate= 2007-06-28
|last= Holt
|first= Jason E.
|date= [[2006-03-04]]
|quote=
}} </ref><ref name="nym-source">{{cite web
|url=
http://lunkwill.org/src/nym/
|title= Nym source repository
|accessdate= 2007-06-28
|last= Holt
|first= Jason E
|quote=
}} </ref>
One idea for a separate authentication system involved nymbles.<ref
name="nymble">{{cite web
|url=
http://www.petworkshop.org/2007/papers/PET2007_preproc_Nymble.pdf
|title= Nymble: Anonymous IP-Address Blocking
|accessdate= 2007-06-23
|author= Peter C. Johnson, Apu Kapadia, Patrick P. Tsang, and Sean W. Smith
|year= 2007
|publisher= [
http://www.petworkshop.org/2007/program.php 7th Workshop
on Privacy Enhancing Technologies], Ottawa, Canada
|quote=
}} </ref> Some flaws:
* While it does provide more layers of protection than Nym, it is
still significantly less secure than Tor - the users with the greatest
perceived need for privacy probably won't use it.
* Since it still relies on IP addresses as a scarce resource, it would
not help Tor exit node operators.
* Less entrance blocking resistance than Tor - the Chinese ISPs might
actually bother to block it.
A pseudonymity system which uses some scarce resource besides IP
addresses would better protect the privacy of Tor users.
=== Increasing the cost of nym creation ===
''Non-proxy IP addresses work economically because they are scarce,
but there are other scarce resources; however, human resources work
better than computer resources.''
<!-- Editor's note: There was significant discussion in the or-talk
mailing list archives on this, which I should really look up.
-->Basically, there are a number of resources of varying degrees of
scarcity, not only IP addresses but also e-mail addresses (ISP-issued
ones are more expensive than free ones), puzzle-solving, etc.
Proof-of-work where a computer could do the work doesn't work.<ref
name="proofofworkprovesnottowork">{{cite web
|url=
http://www.cl.cam.ac.uk/~rnc1/proofwork.pdf
|title= "Proof-of-Work" Proves Not to Work
|accessdate= 2007-06-28
|author= Ben Laurie and Richard Clayton
|publisher= University of Cambridge
|quote=
}} </ref> Hence, how can we raise the human-effort cost of nym
creation? (Note that traditional, non-proxy IPs are theoretically
somewhat costly in human effort to change, which is why
[[#IP-based_blocking.2C_and_why_it_works_better_than_nothing |IP-based
blocking works better than nothing]].) The goal is to make it
difficult enough to authenticate as a good user via Tor, that most of
the bad users will simply switch to not-good open proxies rather than
go through the whole thing in exchange for a few minutes of vandalism,
while good users will only need to go through the process once
(assuming they don't [[#When_pseudonymity_isn.27t_anonymous_enough
|find such a pseudonym to compromise their anonymity too much]]).
As Jimbo said,
: 'For now the key thing to do is to shift the incentives on the bad
users so that Tor is less desirable for them than playing with the
broken proxies or just doing whatever with a dialup account or aol
addresses or whatever.'<ref name="jimbo-incentives"> {{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00332.html
|title= Re: Wikipedia & Tor
|accessdate= 2007-06-28
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-28]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
An excerpt of Jimbo's comments from later in the conversation,
: 'So the _degree_ of trust we need is actually quite small. It isn't
"We certify this person to be a certain user, guaranteed, the same as
ever". It's just "this packet is being sent to you from a source that
has somehow tended generally to lead us to believe to some small
extent that the person posting it has not been a jackass, by and
large".
: 'Or, as has been brilliantly discussed here already, it could be
"this packet has been sent to you via a mechanism that one might
bother to use, were one a dissident really needing anonymity, but
sufficiently bothersome that were one simply a lunatic on crack, one
would more likely have simply switched to using anonymous proxies".
: 'It won't be perfect, but as an empirical matter, it's probably good
enough.'<ref name="jimbobothersome">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00356.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-28
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-29]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
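
For readers unfamiliar with the term, here is a minimal hashcash-style sketch of the kind of computer-solvable proof-of-work referred to in this section. The challenge string, the function names and the difficulty value are assumptions chosen for illustration; this is not taken from Nym, Nymble or any Wikimedia software.

```python
import hashlib
from itertools import count

def _digest_value(challenge: str, nonce: int) -> int:
    """SHA-256 of 'challenge:nonce', read as a big-endian integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")

def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Check that the hash starts with at least difficulty_bits zero bits."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))

def solve(challenge: str, difficulty_bits: int) -> int:
    """Brute-force the first nonce that satisfies verify()."""
    for nonce in count():
        if verify(challenge, nonce, difficulty_bits):
            return nonce

# Hypothetical challenge a server might hand out before issuing a pseudonym.
nonce = solve("nym-request-example", difficulty_bits=12)
print(verify("nym-request-example", nonce, difficulty_bits=12))
```

Solving costs machine time and checking costs a single hash, but no human effort is involved at any point. That is the weakness noted above: whoever has spare computers can pay this price as often as they like, so it does not shift the incentives in the way a human-effort cost would.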
=== Increasing the human-effort of nym creation ===
''Requiring Tor users to create or improve an article to be unblocked
could actually provide a much higher level of security for Wikipaedia
than IP-based authentication, although it depends on the effort
required of them.''
While we could do some complex puzzle system,<ref name="murdoch">{{cite
web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00340.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-28
|last= Murdoch
|first= Steven J.
|date= [[2005-09-29]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref> remember that we are an encyclopaedia project, so the most
natural puzzle for us would be the improvement of a Wikipaedia
article. Blocked users can still edit their talk pages. So, the user
first creates an account - we can either hardblock Tor IPs with
account creation enabled, or they can e-mail unblock-en-l. Once they
get an account, they can now edit their talk page. Paste neglected
Wikipaedia article into talk page. Improve Wikipaedia article, using
{{tl|helpme}} as needed. After going through this effort and
authenticating as a good user, use {{tl|unblock}}. The user may be
granted ipblock-exempt (see
[
http://bugzilla.wikimedia.org/show_bug.cgi?id=3706 bug report] and
[[Wikipedia:Wikipedia Signpost/2007-01-08/Technology
report|Signpost]]), meaning the hardblocks won't apply to them. If
they do start causing trouble, they can be easily blocked. A similar
idea is suggested in {{tl|2ndchance}}.
This should in fact provide greater security than IP-based
authentication. Consider the difference between 'assessment signals'
and 'conventional signals'. Assessment signals cannot easily be
faked, since showing the signal requires possessing the quality.
'Conventional signals' are cheap but can easily be faked, diluting
their value. It's the difference between showing strength by having a
thick neck and by wearing a t-shirt that says 'Gold's Gym Powerlifter'.
The writing of an encyclopaedia article is an assessment signal, since
it shows a) willingness to put in effort, consistent with a real
concern for privacy and b) skill (or at least interest) in
contributing to the encyclopaedia. The possession of a non-proxy IP
address is only a conventional signal, and not even one which says
much.<ref name="donath-modelsofhonestyanddeception">{{cite web
|url =
http://smg.media.mit.edu/people/Judith/Identity/IdentityDeception.html#29347
|title = 'Models of honesty and deception'. Identity and deception in
the virtual community
|accessdate = 2007-07-09
|last = Donath
|first = Judith S.
|date = [[1996-11-12]]
|work = Communities in cyberspace
|publisher = Berkeley: University of California Press
|pages = 1
|quote =
}} Also available on
[http://citeseer.ist.psu.edu/donath97identity.html CiteSeer]
([http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://cites… PostScript],
[http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://cites… PDF],
[http://citeseer.ist.psu.edu/cachedpage/63811/1 Image]) </ref>
Jimbo suggested a similar system:
:'But, we could do something like: allow non-logged in posts, and
allowed posts with Tor *for trusted accounts*, but not non-logged-in
posts with Tor, and not logged-in-but-not-yet-trusted accounts with
Tor.
:'Still, there's a flaw: this means you have to come around to
Wikipedia in an non-Tor manner long enough for us to trust you, which
pretty much blows the whole point of privacy to start with.' <ref
name="jimbotrust">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00292.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-28
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-27]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
The above fills in the question of how to become trusted without
sacrificing privacy.
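As a rough sketch, the rule Jimbo describes can be written as a single decision function. The names `may_edit`, `via_tor`, `logged_in` and `trusted` are made up for illustration and do not correspond to anything in MediaWiki; "trusted" here would mean an account that has earned ipblock-exempt through the talk-page route described above.

```python
def may_edit(via_tor, logged_in, trusted):
    """Return True if an edit should be accepted under the sketched rule.

    Hypothetical helper: none of these names exist in MediaWiki.
    """
    if not via_tor:
        # Non-Tor edits are allowed whether or not the editor is logged in.
        return True
    # Through Tor, only logged-in accounts that have earned trust may edit.
    return logged_in and trusted


if __name__ == "__main__":
    # Print the whole decision table.
    for via_tor in (False, True):
        for logged_in in (False, True):
            for trusted in (False, True):
                print(via_tor, logged_in, trusted,
                      may_edit(via_tor, logged_in, trusted))
```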
Problem: ipblock-exempt is only available to admins right now.<ref
name="bug3706">
http://bugzilla.wikimedia.org/show_bug.cgi?id=3706
</ref><ref
name="mediawikirev18904">http://svn.wikimedia.org/viewvc/media…
The code to add a separate group for ipblock-exempt, and even allow
bureaucrats or admins to add people to this group, has been written,
but the Wikimedia developers do not want to commit it. Basically, the
developers want a more elegant solution that will make the userrights
interface more modular in general, rather than a simple hack for this
one group.<ref name="bug9862">{{cite web
|url=
http://bugzilla.wikimedia.org/show_bug.cgi?id=9862
|title= Bug 9862 - Separate group for ipblock-exempt on en.wikipedia
|accessdate= 2007-06-21
|author= Armed Blowfish, Rob Church, Martinp23, Simetrical, ^demon
|year= 2007
|publisher= Wikimedia Bugzilla
|quote=
}}</ref><ref name="bug6711">{{cite web
|url=
http://bugzilla.wikimedia.org/show_bug.cgi?id=6711
|title= Bug 6711 - More modular userrights interface
|accessdate= 2007-06-21
|author= Simetrical, Rotem Liss, Titoxd, Max Semenik, et al.
|publisher= Wikimedia Bugzilla
|quote=
}}</ref> While Wikipaedia could theoretically grant adminship to the
people it wants to unblock,<ref name="ipblock-exempt-wikitech">{{cite web
|url=
http://www.gossamer-threads.com/lists/wiki/wikitech/79496
|title= New ipblock-exempt permission
|accessdate= 2007-06-22
|author= Andrew, Jepe, Simetrical and Nospam
|year= 2007
|month= January
|publisher= Wikitech Mailing List
|quote=
}}</ref> it is [[Wikipedia:Requests for
adminship/Armedblowfish|highly]] [[Wikipedia:Requests for
adminship/CharlotteWebb|unlikely]] that any Tor users, with the
possible exception of Chinese ones, will be trusted by the community
enough to attain adminship. Also note that Jimbo has said that
ipblock-exempt, if it were available, would not cost money.<ref
name="jimbonomoney">{{cite web
|url=
http://archives.seul.org/or/talk/Sep-2005/msg00387.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-28
|last= Wales
|first= Jimmy
|authorlink= Jimmy Wales
|date= [[2005-09-30]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
If community feeling on the matter were different, an interesting
criterion for a Tor user or exit node operator, with no contributions
outside of his or her talk page, might be the old 'One Featured
Article'. See [[User:Miborovsky/1FA]], [[User:Jguk/admin criterion]],
and the <span
class="plainlinks">[{{fullurl:Special%3ALog|type=delete&user=&page=User%3AMailer+diablo%2FOne+Featured+Article}}
deletion log of User:Mailer diablo/One Featured Article].</span>
Given the difficulty someone unable to edit outside of their talk page
might have writing a featured article, this question is probably
academic, but: Given that there is no way to individually unblock a
Tor user or exit node operator besides granting adminship, should that
Tor user or exit node operator be granted adminship if he or she
writes a featured article on his or her talk page? Since the
human-effort cost of writing a featured article is far higher than the
human effort cost of maintaining two separate IP addresses/ranges, I
would say yes.
== Sybil attack prevention and detection ==
''To prevent most Sybil (sockpuppet) attacks, make creating accounts
time-consuming; to detect the persistent ones, use writing and
behavioural analysis.''
Defenses against [[Sybil attack]]s fall in two categories - prevention
and detection. Traditional Sybil attack prevention lies in increasing
the cost of nym creation, [[#Increasing_the_cost_of_nym_creation |as
described above]]. The higher the
[[#Increasing_the_human-effort_of_nym_creation |human-effort]] cost of
nym creation, the fewer Sybils an attacker will have time to create.
However, as not all people have the same amount of time, this does
have limitations. According to Judith S. Donath of MIT Media Lab,
'One can have, some claim, as many electronic personas as one has time
and energy to create.'<ref name="donath1">{{cite web
|url =
http://smg.media.mit.edu/people/Judith/Identity/IdentityDeception.html
|title = Identity and deception in the virtual community
|accessdate = 2007-07-09
|last = Donath
|first = Judith S.
|date = [[1996-11-12]]
|work = Communities in cyberspace
|publisher = Berkeley: University of California Press
|pages = 1
|quote =
}} Also available on
[http://citeseer.ist.psu.edu/donath97identity.html CiteSeer]
([http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://cites… PostScript],
[http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://cites… PDF],
[http://citeseer.ist.psu.edu/cachedpage/63811/1 Image]) </ref>
One could attempt to verify that each online persona represents a
different actual person by demanding legal ID. However, this would
exclude about one third of the world's population - over two
billion people - according to UNICEF. And not an evenly distributed
one third, either, but one mostly living in areas which are
already underrepresented on Wikipaedia, hence increasing the
[[Wikipedia:WikiProject Countering systemic bias |systemic bias]]
problem.<ref name="UNICEF1">{{cite news
| title = Birth registration: The 'first' right
| url =
http://www.unicef.org/pon98/civil1.htm
| work = The Progress of Nations 1998
| publisher = [[UNICEF]]
| date =
| accessdate = 2007-07-09
| quote = Every year, about 40 million babies -- one third of all
births -- go unregistered around the world.
}}</ref><ref name="UNICEF2">{{cite news
| title = Millions are 'missing'
| url =
http://www.unicef.org/pon98/civil2.htm
| work = The Progress of Nations 1998
| publisher = [[UNICEF]]
| date =
| accessdate = 2007-07-09
| quote = The obstacles to registration are often banal, the product
of misplaced priorities and bureaucratic inadequacies. Poor and rural
countries tend to have lower registration rates, struggling as they
must to cope with the inevitable shortages of trained personnel and
modern technology, the logistical problems of travelling to registry
offices and ignorance or fear of the process. As a result, birth
registration lags in countries such as Sierra Leone, which has a
registration rate of less than 10 per cent; Zimbabwe, with around one
third registered; and Bolivia, where about half the people have a
birth certificate.
}}</ref> It would also discourage many others, defeating the point of
privacy far more than an IP address does, and it invites [[identity
document forgery]]. Besides, the fact that someone has a legal name
tells us nothing about who they really are.
: ' 'Tis but thy name that is my enemy;
: Thou art thyself, though not a Montague.
: What's Montague? it is nor hand, nor foot,
: Nor arm, nor face, nor any other part
: Belonging to a man. O, be some other name!
: What's in a name? that which we call a rose
: By any other name would smell as sweet;
: So Romeo would, were he not Romeo call'd,
: Retain that dear perfection which he owes
: Without that title. Romeo, doff thy name,
: And for that name which is no part of thee
: Take all myself.'
: — William Shakespeare <ref name="shakespeare">{{cite book
| last = Shakespeare
| first = William
| authorlink = William Shakespeare
| title = Romeo and Juliet
| origyear = Around 1597
| url =
http://shakespeare.mit.edu/romeo_juliet/
| accessdate = 2007-07-09
| location = England
| chapter = Act II, Scene II
| chapterurl =
http://shakespeare.mit.edu/romeo_juliet/romeo_juliet.2.2.html
| quote =
| ref =
}}</ref>
'That which we call a rose by any other name would smell as sweet'.
Even without connecting online personas to real entities, online
personas still possess recognisable qualities which alert us to
Sybils. Hence, writing and behavioural analysis. Wikipaedia already
does this, as explained in the
[[#Current_Sybil_attack_detection_practice_on_Wikipaedia |next
section]]. <!--
http://archives.seul.org/or/talk/Jan-2005/msg00115.html -->
<span id="Current_Sybil_attack_detection_practice"></span><span
id="Current_misused_sockpuppet_detection_practice"></span>
== Current Sybil attack detection practice on Wikipaedia ==
''IPs alone have never been enough to detect Sybils; Wikipaedia has
always relied on writing and behavioural analysis.''
Many users can be blocked without knowing whether they are Sybils, so long as
we know they are harming Wikipaedia. IP addresses don't correspond
one-to-one to people, but Wikipaedia doesn't need to ask every pair of
editors who live together, use the same public facilities, share the
same ISP, or live in the same geographic region to prove that they
are, in fact, separate people. Wikipaedia only [[WP:RFCU |performs
checkuser]] after there is suspicion - two editors sound like the same
person. Many times, Wikipaedia can [[WP:SSP |reach a conclusion]] on
this without ever asking a checkuser.
<span
id="Figuring_out_whether_or_not_Alice_and_Alex_are_the_same_person.2C_without_knowing_who_in_the_world_Alice_and_Alex_are"></span>
== Figuring out whether pseudonymous identities are Sybils, without
knowing who in the world they are ==
''Writing and behavioural analysis based Sybil detection still works
on Tor users, and should be enough most of the time, especially if you
also make it more time-consuming to create accounts.''
As noted above, you all already do this much of the time. A user
known to be using Tor may reasonably expect Wikipaedia to do the best
[[WP:SSP|non-IP Sybil investigation]] that is possible before worrying
about IP evidence, out of respect for the user's privacy. Writing
analysis can sometimes be inconclusive, but until the non-IP Sybil
investigation has actually proved inconclusive, we don't need to worry
about the lack of available IP evidence. Tor-using editors who are
never suspected, or who can be determined to be or not be Sybils based
on writing analysis, need never be asked to reveal their IP address.
Note that if you increase the cost of nym creation through Tor, Tor
will not be particularly attractive to Sybil attackers, and the
human-effort cost of subtly misusing Sybils is in any case greater
than the human-effort cost of changing IPs.
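To make 'writing analysis' slightly more concrete, here is a toy sketch. It is not the method Wikipaedia actually uses, which is human judgement; it merely compares two writing samples by the cosine similarity of their character-trigram counts. The function names and the choice of trigrams are assumptions made for illustration, and a high score would only be a hint for a human to look closer, never proof.

```python
from collections import Counter
from math import sqrt


def trigram_counts(text):
    """Count overlapping character trigrams in lower-cased text.

    Runs of whitespace are collapsed to single spaces first.
    """
    normalised = " ".join(text.lower().split())
    return Counter(normalised[i:i + 3] for i in range(len(normalised) - 2))


def cosine_similarity(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = sqrt(sum(c * c for c in a.values()))
    norm_b = sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def writing_similarity(sample_one, sample_two):
    """Score two writing samples between 0.0 (no shared trigrams) and 1.0."""
    return cosine_similarity(trigram_counts(sample_one),
                             trigram_counts(sample_two))
```

Comparing every pair of suspected accounts this way grows quadratically with the number of accounts, which is one more reason to run it only after suspicion arises rather than on everyone.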
=== When that fails... ===
''On the rare occasion that writing and behavioural analysis is not
enough to determine if a Tor user is a Sybil, it is reasonable to ask
the Tor user for their IP then, but not before.''
If we really cannot figure out if an editor is misusing Sybils without
the aid of IP evidence (which is probabilistic anyway), it is
reasonable to grant that user a choice:
* Remain pseudonymous, but consent to allow Wikipaedia to assume that
the editor is a Sybil. (What we have been trying to avoid, but a
necessary fail-safe for the protection of the editor's pseudonymity
and Wikipaedia's integrity.)
* Voluntarily share their IP address with a checkuser. Wikipaedia can
try to do a number of things to make this option more palatable:
** By not asking until this point, Wikipaedia demonstrates respect for
the editor's privacy, which may make the editor more trusting.
** By not asking until this point, the editor may have a chance to
establish a trusting relationship with a checkuser prior to being
asked to reveal their IP.
** The editor may ask that their IP only be revealed to one checkuser,
and that the checkuser not share the information with other
checkusers.<ref name="edda">{{cite book
| others = Translated by Olive Bray
| title = [[Poetic Edda]]
| origdate = No earlier than about [[985]]
| accessdate = 2007-06-19
| language = English
| chapter = Hávamál, Wisdom for Wanderers and Counsel to Guests
| chapterurl =
http://www.pitt.edu/~dash/havamal.html#wanderers
| quote = Each man who is wise and would wise be called <br />
must ask and answer aright. <br />
Let one know thy secret, but never a second, -- <br />
if three a thousand shall know.
}}</ref> This is not a problem; we trust our checkusers to make
these determinations by themselves.
** The editor may and probably should ask that the checkuser provide a
[[GNU Privacy Guard|GPG]] public key, preferably one which has been
verified to some extent to belong to the checkuser, such that when the
editor sends the checkuser their IP address, no one else will be able
to intercept the message. (Unencrypted communication is the
equivalent of a postcard: anyone who can intercept it can read it, and
no one will ever know the message was intercepted.)
** The editor may ask that the checkuser simply make a determination
on the probability that the editor is a Sybil, without providing any
reasoning, e.g. no disclosure of the editor's geographic location.
It's okay, we trust checkusers to make this determination.
In return for heightened IP security, Wikipaedia may reasonably ask
the editor for whatever confirmation that their IP address is theirs
is possible. Some possibilities:
* A screenshot of a website, like
[http://www.ip-adress.com/ ip-address.com], that says your external IP address. (Advantages:
Very easy. Weaknesses: Image manipulation, not helpful for Tor exit
node operators.)
* An e-mail sent from an ISP e-mail or directly from one's computer.
(Advantages: Still easy, a bit more secure than a screenshot, is
useful for Tor exit node operators. Disadvantages: [[SMTP]] is not
secure.)
* Give the checkuser a temporary [[OpenSSH]] account on the computer.
(Advantages: Securely proves that the editor has high enough access on
the computer to create accounts, making this a fairly sure method,
even if the editor is running a Tor exit node. Disadvantages: Harder
than other methods, requires a *nix operating system, editor may be
uncomfortable granting an account to someone else, or opening their
SSH port. Also not helpful to people using public computers.)
* It may also be helpful for an editor with a dynamic IP address to
give the checkuser a domain or subdomain name linked to that dynamic
IP address, if they have one. [http://dyndns.com DynDNS] provides
such subdomains for free.
== Making sure that Alice is indeed Alice ==
''Tor users are a bit more vulnerable to having their passwords
sniffed over unencrypted login; solutions involve encryption.''
If you log in via
[http://en.wikipedia.org/wiki/Special:Userlogin en.wikipedia.org/wiki/Special:Userlogin],
you are sending your user
name and password over an unencrypted connection, which is the
equivalent of a post card - anyone who can intercept it can read it,
no trouble. Tor users may be more vulnerable since Tor only encrypts
up to the exit node, and they use so many circuits, with various
different exit nodes<!-- Everything from Alice to her exit node is
encrypted (at various levels); it is not apparent how exit node to
Wikipedia would be likely to entail more hops. Saying that Tor is
likely to be monitored either by hostile parties running Tor nodes or
outside forces is more likely true, but the governments are spying on
everyone anyways... (Carnivore and EarthLink; EFF, NSA, AT&T; etc.)
For non-government entities I would be much more worried about people
using wireless at Wikimania than someone stealing your password via
Tor. — The preceding comment added by Kotepho.
The problem is, Alice keeps switching exit nodes, every 30 seconds to
half hour, usually about every 5 minutes. So if she uses 100
different exit nodes, that's 100 chances that the exit node her data
goes through might be logging... not to mention each exit node has a
different path between it and the Wikimedia servers, increasing the
number of data paths, each of which might have someone sniffing
packets. — Armed Blowfish
You are only going to authenticate with your password once per session;
all other authentication is done with the edit token. They would be
able to do most actions with the account—everything except setting a
new password at a cursory glance as that requires the old password,
but they could set a new email(why not require the password to set a
new email too?) and have a new password emailed to them, and then log
in with the temporary password and set a new one and take over the
account (reading the source in viewvc is non-optimal, but I do not see
the cases where the edittoken is invalidated other than confirming your
email and changing your password, not even on logout?)—but they do not
get a chance at /your/ password more than once per log in attempt.
There is also the possibility of active rather than passive attacks
such as faking a log out or actually logging you out to get your
password. It does indeed present some problems with making sure Alice
is Alice for a specific action ([[non-repudiation]], essentially), but
not after the fact (at least once there is some way for someone to
revoke your edittoken and no way to change the password without the
password, c.f. the compromised accounts recently proving they were the
rightful owners so they could get rights back). [Sorry if I'm being
too pedantic.] — Kotepho
Right, but each login a Tor user may be using a different exit node...
in Sweden, then the United States, then Russia, then the United
Kingdom, then Canada... and on, different exit nodes from all around
the world at each login. (Well, there are some repeats, but still.)
— Armed Blowfish
-->, allowing for more possible points of interception. (If Alice
logs in over 100 different exit nodes, that is at least 100 different
paths from exit node to the Wikimedia Foundation that might be
[[packet sniffing|sniffed]].)
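The arithmetic behind this worry can be sketched as follows. Suppose, purely as an assumption, that each login path is monitored independently and with the same probability `p`; then the chance that at least one of `n` logins is observed is 1 - (1 - p)^n. Both the independence and any particular value of `p` are made up for illustration, since nobody knows the real figure.

```python
def chance_of_interception(logins, p_per_path):
    """Probability that at least one of `logins` independent paths is sniffed.

    `p_per_path` is an assumed, not measured, per-login probability.
    """
    if not 0.0 <= p_per_path <= 1.0:
        raise ValueError("p_per_path must be between 0 and 1")
    return 1.0 - (1.0 - p_per_path) ** logins


if __name__ == "__main__":
    # Compare a single login with 10 and 100 logins at an assumed 1% risk each.
    for n in (1, 10, 100):
        print(n, round(chance_of_interception(n, 0.01), 3))
```

Here `logins` counts logins rather than edits, since the password itself is only sent once per session.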
[https://secure.wikimedia.org/wikipedia/en/wiki/Special:Userlogin secure.wikimedia.org]
is encrypted, but too many users using it could
overload the server, not to mention that [[Transport Layer
Security|TLS]] is far from perfect. A hidden service would be
fantastic, but this may not be cost-effective for the Wikimedia
Foundation.<ref name="dipierro1">{{cite web
|url=
http://archives.seul.org/or/talk/Oct-2005/msg00228.html
|title= Wikipedia and Tor - a solution in the works?
|accessdate= 2007-06-21
|last= DiPierro
|first= Anthony
|date= [[2005-10-29]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref><ref name="tiwaz">{{cite web
|url=
http://archives.seul.org/or/talk/Oct-2005/msg00246.html
|title= RE: Wikipedia and Tor - a solution in the works?
|accessdate= 2007-06-21
|last= tiwaz
|first= loki
|date= [[2005-10-31]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref>
You could ask Tor users to provide a [[GNU Privacy Guard|GPG]]
[[Public/private key cryptography|public key]], which would allow them
to confirm at a later date that they are the same person as before.
Note that this would not prevent their accounts from being
hijacked, only provide a way for them to confirm their identity to
another Wikipaedian. This has the benefit of increasing the cost of
nym creation - although anyone can generate a GPG public-private key
pair, it takes time to do this. Note that Freenode already requires
this from Tor users as a condition for using the better of their two
hidden services - the one that does not get banned during periods of
general Tor misuse.<ref name="freenode">{{cite web
|url=
http://freenode.net/irc_servers.shtml#tor
|title= Accessing Freenode Via Tor
|accessdate= 2007-06-21
|year= (c) 2002-2007
|work= IRC Servers
|publisher= Peer-Directed Projects Center
|quote=
}}</ref>
Note that some people may enjoy the plausible deniability granted by a
weak authentication system.<ref name="dipierro2">{{cite web
|url=
http://archives.seul.org/or/talk/Oct-2005/msg00248.html
|title= Re: Wikipedia and Tor - a solution in the works?
|accessdate= 2007-06-20
|last= DiPierro
|first= Anthony
|date= [[2005-10-31]]
|publisher= or-talk Tor mailing list
|quote= The reality is that it's not that huge of a deal if someone
finds out my Wikipedia password. In fact, in some ways it's a feature
in that I can repudiate any edit made using my account.
}}</ref>
== When pseudonymity isn't anonymous enough ==
''Pseudonyms may present a dangerous privacy risk for some Tor users,
but there is not much we can do besides invite them to edit Tor IP
talk pages.''
Pseudonymity is by nature less anonymous than anonymity. If you can
say "this anonymous person is the same as that anonymous person", then
if you ever figure out who the anonymous person is, you can hold them
accountable for all of their pseudonymous actions. This may be
dangerous for some people.<ref name="eward">{{cite web
|url=
http://archives.seul.org/or/talk/Oct-2005/msg00002.html
|title= Re: Hello directly from Jimbo at Wikipedia
|accessdate= 2007-06-21
|last= Eward
|first= Dustin
|date= [[2005-10-01]]
|publisher= or-talk Tor mailing list
|quote=
}}</ref> However, you can't very well authenticate yourself as
trusted if you are not willing to establish a pseudonym. Therefore, I
am not sure there is much we can do for these people besides invite
them to place {{tl|helpme}} requests on various Tor exit node IP talk
pages.
== If you all are going to hardblock Tor, you should really do a better
job of it ==
''Wikipaedia's current inaccurate Tor blocking methods are bad for
both Wikipaedia and Tor, but the thoughtful Tor developers have
provided several solutions.''
Tor is a highly dynamic network - the IP addresses of exit nodes
change frequently. In addition, exit nodes have individual exit
policies. Wikipaedia's current blocking methods result in a number of
false positives and false negatives. Exit nodes which do not exit to
Wikipaedia are blocked, as are IPs that used to be exit nodes, while
other exit nodes are missed. The latter is a problem since any Tor
user who can edit a config file can pick which exit nodes they want or
don't want to use, making the current hardblock on Tor quite
circumventable by a user determined to do so.<ref
name="faq-abuse-bans">{{cite web
|url=
http://tor.eff.org/faq-abuse.html.en#Bans
|title= Abuse FAQ for Tor Server Operators: I want to ban the Tor
network from my service
|accessdate= 2007-06-20
|date= [[2007-06-17]]
|publisher= The Tor Project
|quote=
}}</ref><ref name="config">{{cite web
|url=
http://cvs.seul.org/viewcvs/viewcvs.cgi/tor/trunk/src/config/torrc.complete…
|title= Sample Tor configuration file: torrc.complete.in (rev 10570)
|accessdate= 2007-06-20
|date= [[2007-06-12]]
|publisher= The Tor Project
|quote=
}}</ref>
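The two kinds of error can be stated precisely with sets. The sketch below uses made-up addresses from the documentation ranges; `blocked` stands for whatever list Wikipaedia currently blocks and `actual_exits` for the nodes that really exit to Wikipaedia. Both names, and the helper itself, are hypothetical.

```python
def blocking_errors(blocked, actual_exits):
    """Return (false_positives, false_negatives) as sets of IP strings."""
    blocked = set(blocked)
    actual_exits = set(actual_exits)
    # Blocked as Tor, but not currently an exit to Wikipaedia.
    false_positives = blocked - actual_exits
    # Exits to Wikipaedia, but missing from the block list.
    false_negatives = actual_exits - blocked
    return false_positives, false_negatives


if __name__ == "__main__":
    blocked = ["192.0.2.1", "192.0.2.2", "198.51.100.7"]
    actual = ["192.0.2.2", "203.0.113.9"]
    fp, fn = blocking_errors(blocked, actual)
    print("wrongly blocked:", sorted(fp))
    print("missed:", sorted(fn))
```

The false negatives are the ones a determined user can select in their config file; the false positives are ordinary users caught by a stale or over-broad list.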
There is no reason this should be a problem. The Tor developers
respect Wikipaedia's self-determination to block exit nodes, and have
even written programs to help sites like Wikipaedia do this as
accurately as possible. Nick Mathewson explains:
: 'We're okay with subverting entry-blocks. This isn't hypocrisy;
this is because entry blocks are fundamentally different. When Alice
connects to Tor to connect to Bob, an exit block means that Bob
doesn't want anonymous connections, whereas an entry block means that
somebody doesn't want Alice to have privacy. Entry blocking subverts
Alice's self-determination, whereas exit blocking on Bob's part *is*
self-determination, even if we don't like it.'<ref name="nickm1"
/>
The Tor developers offer two good ways to do this:
[http://exitlist.torproject.org/ TorDNSEL] and the Python
[http://cvs.seul.org/viewcvs/viewcvs.cgi/tor/trunk/contrib/exitlist?rev=1040… exitlist].
Although TorDNSEL is better, or so the Tor developers told
me, I have not actually read
[http://tor.eff.org/svn/trunk/doc/contrib/torel-design.txt the documentation] yet.
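For orientation only: the design document linked above describes a DNS-based lookup in which the candidate exit address, the destination port and the destination address are packed into one query name, with both addresses written in reverse octet order. The sketch below merely builds such a name. The exact layout and the zone name are assumptions to be checked against that document before relying on them, and `tordnsel_query_name` is a made-up helper, not part of any Tor tool.

```python
def tordnsel_query_name(client_ip, dest_ip, dest_port, zone):
    """Build an 'ip-port' style DNSEL query name (layout assumed, see above).

    client_ip -- address of the connecting host that may be a Tor exit
    dest_ip   -- address of the service being protected
    dest_port -- port of the service being protected
    zone      -- DNS zone that serves the exit list (caller-supplied)
    """
    def reverse_octets(ip):
        octets = ip.split(".")
        if len(octets) != 4 or not all(
                o.isdigit() and 0 <= int(o) <= 255 for o in octets):
            raise ValueError("not a dotted-quad IPv4 address: %r" % ip)
        return ".".join(reversed(octets))

    return "%s.%d.%s.%s" % (reverse_octets(client_ip), dest_port,
                            reverse_octets(dest_ip), zone)


if __name__ == "__main__":
    # Placeholder client address and zone; 66.230.200.100:80 is en.wikipedia.org.
    print(tordnsel_query_name("192.0.2.1", "66.230.200.100", 80,
                              "ip-port.example.org"))
```

The name would then be resolved with an ordinary DNS lookup; how a match is signalled in the answer should also be taken from the documentation rather than from this sketch.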
Anyway, the Python exitlist script offers the following advantages:
* Since it uses the same data the Tor client uses, it is as up-to-date
as the client data. (Note that this means you have to actually run
Tor in order to get the data.)
* Accurately parses exit policies so only exit nodes that actually
exit to Wikipaedia are listed.
* Minimal false negatives and false positives. Yes, it can get
slightly out of date, it can only resist as many bad authorities as
the Tor client itself, and it can't get all of the multi-homed exit
node IPs, but it is better than the other lists you use. Regarding
the multi-homed exit nodes,
[http://peertech.org/pub/tor-dissim-p80-exits.txt this list] gives the
real exiting IPs of such exit nodes which exit on port 80, rather than
the advertised IPs.
* It generates a nice, happy, easily parsable list of IPs with one IP
address per line. This would be great if the community were willing
to give adminship to a bot such as [[Wikipedia:Requests for
adminship/TawkerbotTorA|TawkerbotTorA]].
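Because the output is one address per line, consuming it takes only a few lines. A minimal sketch, assuming the list has been saved to a text file; `read_exit_list` is a hypothetical helper, and what a bot would then do with the addresses is left out entirely.

```python
import ipaddress


def read_exit_list(lines):
    """Return the de-duplicated IPv4 addresses in `lines`, sorted numerically.

    Blank lines are skipped; a malformed address raises ValueError
    (ipaddress.AddressValueError is a subclass of it).
    """
    addresses = set()
    for line in lines:
        candidate = line.strip()
        if candidate:
            addresses.add(ipaddress.IPv4Address(candidate))
    return [str(address) for address in sorted(addresses)]
```

Usage would look like `with open("torexitwikipaedia") as handle: ips = read_exit_list(handle)`.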
For your convenience, I have included the exitlist script with two
minor modifications: it looks for nodes exiting to Wikipaedia, and it
includes the script's license.
On *nix, or at least [[OpenBSD]] (probably other *nixes too, but I
tested on OpenBSD), it runs with the following command:
<pre>
cat ~/.tor/cached-routers* | python exitlist > torexitwikipaedia
</pre>
I don't know what works on Windows.
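One possible way around the shell dependency is to do the concatenation in Python instead of with `cat`. This is only a sketch: where Tor keeps its `cached-routers*` files on Windows is not guessed here, so `data_dir` is a parameter, and `interpreter` must name a Python 2 interpreter because the exitlist script itself is Python 2. Both function names are made up.

```python
import glob
import os
import subprocess


def gather_descriptors(data_dir):
    """Concatenate every cached-routers* file in data_dir, like `cat` would."""
    chunks = []
    pattern = os.path.join(data_dir, "cached-routers*")
    # The shell expands globs in sorted order, so do the same here.
    for path in sorted(glob.glob(pattern)):
        with open(path, "rb") as handle:
            chunks.append(handle.read())
    return b"".join(chunks)


def run_exitlist(data_dir, interpreter, script="exitlist",
                 output="torexitwikipaedia"):
    """Feed the descriptors to the script on stdin; save its stdout to `output`."""
    with open(output, "wb") as out:
        subprocess.run([interpreter, script],
                       input=gather_descriptors(data_dir),
                       stdout=out, check=True)
```

A call might look like `run_exitlist(data_dir, "python")`, with `data_dir` set to wherever the local Tor installation stores its cached descriptors.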
=== Modified exitlist code ===
<pre>
#!/usr/bin/python
# Copyright 2005-2006 Nick Mathewson
# See the LICENSE file in the Tor distribution for licensing information.
# The following licensing information copied by Armed Blowfish
# You can find the updated exitlist code here:
# http://tor.eff.org/svn/trunk/contrib/exitlist
# You can find the Tor license here:
# http://tor.eff.org/svn/trunk/LICENSE
# Relevant portion of Tor license file:
# Tor is distributed under this license:
# Copyright (c) 2001-2004, Roger Dingledine
# Copyright (c) 2004-2007, Roger Dingledine, Nick Mathewson
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:
# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.
# * Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following disclaimer
# in the documentation and/or other materials provided with the
# distribution.
# * Neither the names of the copyright owners nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# End info copied by Armed Blowfish. When updating to the latest
# version of the exit list, don't forget to include licensing information
# and make output Wikipaedia-specific.
# Requires Python 2.2 or later.
"""
exitlist -- Given a Tor directory on stdin, lists the Tor servers
that accept connections to given addresses.
example usage (Tor 0.1.0.x and earlier):
python exitlist 18.244.0.188:80 < ~/.tor/cached-directory
example usage (Tor 0.1.1.10-alpha and later):
cat ~/.tor/cached-routers* | python exitlist 18.244.0.188:80
If you're using Tor 0.1.1.18-rc or later, you should look at
the "FetchUselessDescriptors" config option in the man page.
Note that this script won't give you a perfect list of IP addresses
that might connect to you using Tor, since some Tor servers might exit
from other addresses than the one they publish.
"""
#
# Change this to True if you want more verbose output. By default, we
# only print the IPs of the servers that accept any of the listed
# addresses, one per line.
#
VERBOSE = False
#
# Change this to True if you want to reverse the output, and list the
# servers that accept *none* of the listed addresses.
#
INVERSE = False
#
# Change this list to contain all of the target services you are interested
# in. It must contain one entry per line, each consisting of an IPv4 address,
# a colon, and a port number. This default is only used if we don't learn
# about any addresses from the command-line.
# Made this part Wikipaedia-specific. -- Armed Blowfish
# 66.230.200.100:80 is en.wikipedia.org
# 66.230.200.219:443 is secure.wikimedia.org
ADDRESSES_OF_INTEREST = """
66.230.200.100:80
66.230.200.219:443
"""
#
# YOU DO NOT NEED TO EDIT AFTER THIS POINT.
#
import sys
import re
import getopt
import socket
import struct
import time
assert sys.version_info >= (2,2)
def maskIP(ip,mask):
    return "".join([chr(ord(a) & ord(b)) for a,b in zip(ip,mask)])

def maskFromLong(lng):
    return struct.pack("!L", lng)

def maskByBits(n):
    return maskFromLong(0xffffffffl ^ ((1L<<(32-n))-1))
class Pattern:
    """
    >>> import socket
    >>> ip1 = socket.inet_aton("192.169.64.11")
    >>> ip2 = socket.inet_aton("192.168.64.11")
    >>> ip3 = socket.inet_aton("18.244.0.188")
    >>> print Pattern.parse("18.244.0.188")
    18.244.0.188/255.255.255.255:1-65535
    >>> print Pattern.parse("18.244.0.188/16:*")
    18.244.0.0/255.255.0.0:1-65535
    >>> print Pattern.parse("18.244.0.188/2.2.2.2:80")
    2.0.0.0/2.2.2.2:80-80
    >>> print Pattern.parse("192.168.0.1/255.255.00.0:22-25")
    192.168.0.0/255.255.0.0:22-25
    >>> p1 = Pattern.parse("192.168.0.1/255.255.00.0:22-25")
    >>> import socket
    >>> p1.appliesTo(ip1, 22)
    False
    >>> p1.appliesTo(ip2, 22)
    True
    >>> p1.appliesTo(ip2, 25)
    True
    >>> p1.appliesTo(ip2, 26)
    False
    """
    def __init__(self, ip, mask, portMin, portMax):
        self.ip = maskIP(ip,mask)
        self.mask = mask
        self.portMin = portMin
        self.portMax = portMax

    def __str__(self):
        return "%s/%s:%s-%s"%(socket.inet_ntoa(self.ip),
                              socket.inet_ntoa(self.mask),
                              self.portMin,
                              self.portMax)

    def parse(s):
        if ":" in s:
            addrspec, portspec = s.split(":",1)
        else:
            addrspec, portspec = s, "*"

        if addrspec == '*':
            ip,mask = "\x00\x00\x00\x00","\x00\x00\x00\x00"
        elif '/' not in addrspec:
            ip = socket.inet_aton(addrspec)
            mask = "\xff\xff\xff\xff"
        else:
            ip,mask = addrspec.split("/",1)
            ip = socket.inet_aton(ip)
            if "." in mask:
                mask = socket.inet_aton(mask)
            else:
                mask = maskByBits(int(mask))

        if portspec == '*':
            portMin = 1
            portMax = 65535
        elif '-' not in portspec:
            portMin = portMax = int(portspec)
        else:
            portMin, portMax = map(int,portspec.split("-",1))

        return Pattern(ip,mask,portMin,portMax)

    parse = staticmethod(parse)

    def appliesTo(self, ip, port):
        return ((maskIP(ip,self.mask) == self.ip) and
                (self.portMin <= port <= self.portMax))
class Policy:
    """
    >>> import socket
    >>> ip1 = socket.inet_aton("192.169.64.11")
    >>> ip2 = socket.inet_aton("192.168.64.11")
    >>> ip3 = socket.inet_aton("18.244.0.188")
    >>> pol = Policy.parseLines(["reject *:80","accept 18.244.0.188:*"])
    >>> print str(pol).strip()
    reject 0.0.0.0/0.0.0.0:80-80
    accept 18.244.0.188/255.255.255.255:1-65535
    >>> pol.accepts(ip1,80)
    False
    >>> pol.accepts(ip3,80)
    False
    >>> pol.accepts(ip3,81)
    True
    """
    def __init__(self, lst):
        self.lst = lst

    def parseLines(lines):
        r = []
        for item in lines:
            a,p=item.split(" ",1)
            if a == 'accept':
                a = True
            elif a == 'reject':
                a = False
            else:
                # Interpolate the action into the message instead of
                # passing it as a second exception argument.
                raise ValueError("Unrecognized action %r"%a)
            p = Pattern.parse(p)
            r.append((p,a))
        return Policy(r)
    parseLines = staticmethod(parseLines)

    def __str__(self):
        r = []
        for pat, accept in self.lst:
            rule = accept and "accept" or "reject"
            r.append("%s %s\n"%(rule,pat))
        return "".join(r)

    def accepts(self, ip, port):
        # The first matching rule wins; with no match, default to accept.
        for pattern,accept in self.lst:
            if pattern.appliesTo(ip,port):
                return accept
        return True
class Server:
    def __init__(self, name, ip, policy, published, fingerprint):
        self.name = name
        self.ip = ip
        self.policy = policy
        self.published = published
        self.fingerprint = fingerprint

def uniq_sort(lst):
    d = {}
    for item in lst: d[item] = 1
    lst = d.keys()
    lst.sort()
    return lst
def run():
    global VERBOSE
    global INVERSE
    global ADDRESSES_OF_INTEREST
    if len(sys.argv) > 1:
        try:
            opts, pargs = getopt.getopt(sys.argv[1:], "vx")
        except getopt.GetoptError, e:
            print """
usage: cat ~/.tor/cached-routers* | %s [-v] [-x] [host:port [host:port [...]]]
-v verbose output
-x invert results
""" % sys.argv[0]
            sys.exit(0)
        for o, a in opts:
            if o == "-v":
                VERBOSE = True
            if o == "-x":
                INVERSE = True
        if len(pargs):
            ADDRESSES_OF_INTEREST = "\n".join(pargs)
    servers = []
    policy = []
    name = ip = None
    published = 0
    fp = ""
    for line in sys.stdin.xreadlines():
        if line.startswith('router '):
            if name:
                servers.append(Server(name, ip, Policy.parseLines(policy),
                                      published, fp))
            _, name, ip, rest = line.split(" ", 3)
            policy = []
            published = 0
            fp = ""
        elif line.startswith('fingerprint') or \
                line.startswith('opt fingerprint'):
            elts = line.strip().split()
            if elts[0] == 'opt': del elts[0]
            assert elts[0] == 'fingerprint'
            del elts[0]
            fp = "".join(elts)
        elif line.startswith('accept ') or line.startswith('reject '):
            policy.append(line.strip())
        elif line.startswith('published '):
            date = time.strptime(line[len('published '):].strip(),
                                 "%Y-%m-%d %H:%M:%S")
            published = time.mktime(date)
    if name:
        servers.append(Server(name, ip, Policy.parseLines(policy), published,
                              fp))
    targets = []
    for line in ADDRESSES_OF_INTEREST.split("\n"):
        line = line.strip()
        if not line: continue
        p = Pattern.parse(line)
        targets.append((p.ip, p.portMin))

    # Keep only the most recently published descriptor for each fingerprint.
    latest = {}
    for s in servers:
        if (not latest.has_key(s.fingerprint)
            or s.published > latest[s.fingerprint].published):
            latest[s.fingerprint] = s
    servers = latest.values()

    accepters, rejecters = {}, {}
    for s in servers:
        for ip,port in targets:
            if s.policy.accepts(ip,port):
                accepters[s.ip] = s
                break
        else:
            # No target was accepted by this server's policy.
            rejecters[s.ip] = s

    # If any server at IP foo accepts, the IP does not reject.
    for k in accepters.keys():
        if rejecters.has_key(k):
            del rejecters[k]

    if INVERSE:
        printlist = rejecters.values()
    else:
        printlist = accepters.values()

    ents = []
    if VERBOSE:
        ents = uniq_sort([ "%s\t%s"%(s.ip,s.name) for s in printlist ])
    else:
        ents = uniq_sort([ s.ip for s in printlist ])
    for e in ents:
        print e
def _test():
    import doctest, exitparse
    return doctest.testmod(exitparse)

#_test()
run()
</pre>
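For anyone who only wants the gist of how the script decides whether an exit node allows a given destination, here is a rough Python 3 sketch of the same first-match rule evaluation. It is not part of the script above: the names <code>parse_rule</code> and <code>policy_accepts</code> are made up for illustration, and it leans on the standard <code>ipaddress</code> module, which (unlike the script's own masking) does not handle non-contiguous masks such as 2.2.2.2.

```python
import ipaddress

def parse_rule(line):
    """Parse e.g. 'accept 18.244.0.188/16:80-90' (hypothetical helper)."""
    action, pattern = line.split(" ", 1)
    if action not in ("accept", "reject"):
        raise ValueError("Unrecognized action %r" % action)
    addrspec, _, portspec = pattern.partition(":")
    if addrspec == "*":
        addrspec = "0.0.0.0/0"
    # strict=False masks off the host bits, much as maskIP() does above.
    network = ipaddress.ip_network(addrspec, strict=False)
    if portspec in ("", "*"):
        port_min, port_max = 1, 65535
    elif "-" in portspec:
        port_min, port_max = map(int, portspec.split("-", 1))
    else:
        port_min = port_max = int(portspec)
    return action == "accept", network, port_min, port_max

def policy_accepts(rules, ip, port):
    """First matching rule wins; with no match, default to accept."""
    addr = ipaddress.ip_address(ip)
    for accept, network, port_min, port_max in rules:
        if addr in network and port_min <= port <= port_max:
            return accept
    return True

rules = [parse_rule(l) for l in ["reject *:80", "accept 18.244.0.188:*"]]
print(policy_accepts(rules, "192.169.64.11", 80))  # False: "reject *:80" matches first
print(policy_accepts(rules, "18.244.0.188", 80))   # False: same rule matches first
print(policy_accepts(rules, "18.244.0.188", 81))   # True: falls through to the accept rule
```

The order of the rules matters: because <code>reject *:80</code> comes first, even the explicitly accepted address is refused on port 80, which is the same behaviour the doctests for <code>Policy</code> check.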
=== An old patch ===
Adam Langley wrote a patch that would make it easier to play temporary
blocking games with Tor, just as has been done with AOL in the past.
I don't know the details; apparently it needed a few fixes from
someone familiar with the Wikimedia source code, but in any case it
never got committed.<ref name="torblockpatch">http://www.imperialviolet.org/binary/medi…</ref><ref name="torblockpatch-ortalk">http://archives.seul.org/or/talk/M…</ref><ref name="torblockpatch-ortalk-alreadycoded">http://archives.seul.…</ref>
== Works cited ==
<!-- Yes, I know very well that many of these do not count as
[[WP:RS|reliable sources]]. That's okay, I'm not writing an article.
If, for example, you question whether Jimbo is actually the author of
some of these mailing list messages, ask him. -->
<references />