On 23/08/07, Armed Blowfish diodontida.armata@googlemail.com wrote:
On 17/08/07, madman bum and angel madman@ferretproductions.com wrote:
Luna wrote:
For clarification, does this mean that discussion regarding AB in particular, or discussion on proxies in general? I'm of the opinion that the former issue was distracting from productive resolution of the greater matter at hand: trying to find a better working solution to the proxies problem. I'd say more, but want to wait for that clarification.
Thanks for re-focusing us. I'm working on a solution myself; I hope to be able to post a proposal for review within a couple days.
-madman bum and angel
I tried to 're-focus' this a long time ago.
To quote myself, 'I asked for my RfA to be blanked for a reason. Really, if you want to argue for or against allowing Tor editors, that's fine, but could my RfA please be left out of it?' (13 August 2007 10:20)
'To clarify on that, consider good Tor users and exit node operators who have never contributed to Wikipaedia. They cannot be said to have violated policy, since they have obeyed it by not editing, either when most of Tor was softblocked, or by evading Tor blocks while most of Tor has been hardblocked. (Well, unless you want to say that exit node operators allowing exits to Wikipaedia are 'violating policy' by doing so... why some people think Tor exit policies are in Wikipaedia's jurisdiction, I don't know....)
It would be nice if those Tor users and exit node operators could edit, after being authenticated as trusted. On the Tor IRC channel, Wikipaedia is complained about more than any other site, by polite individuals. However, I myself have no interest in getting unbanned/unblocked/whatever, and said RfA is a source of distress for me, so it would be nice if you could leave that out of the debate.' (13 August 2007 10:40)
But apparently no one reads what I write.
Actually, in case you are interested, I tried to suggest things which would be helpful for other Tor users but not for me a long long time ago.
This was on my talk page before I asked it to go poof. If you are interested, I can add some information about honesty and deception in the animal world, and how that relates to Wikipaedia.
Sorry for the formatting; paste it into Wikipaedia and hit preview to look at it, I guess.
------------------------------------------------------------------------------------------------
== Purpose of blocking Tor ==
''Tor is blocked for economic reasons, because the high quantity of destructive edits that could come through it isn't worth putting up with.''
The main purpose of blocking [[Tor (anonymity network)|Tor]] is to reduce the quantity of destructive edits to Wikipaedia. Certainly, Wikipaedia gets destructive edits from other places as well, but open proxies seem to be used for that sort of thing more often than other IP addresses. The complex Sybil attacks on Wikipaedia are fairly unique, but Wikipaedia is in a better position to defend against them than many other systems. However, blocking Tor does result in collateral damage of editors who mean well, and there are other ways of achieving the purpose of protecting Wikipaedia, which is what the rest of this essay examines.
Some quotes:
: 'One of the things we currently do is block Tor. I consider that a reasonable solution to the vandalism problem, but an unfortunate thing, since to my mind, Tor is something very good.
: 'It would be nice if we could look at the edits coming from Tor and say "Oh, these are fine, they are mostly responsible edits." It'd even be ok if we could look at the edits coming from Tor and say "Ok, so there's a touch more vandalism from these than from other ip pools, but there's also some good stuff coming through from places where we normally don't see a lot of editing activity. We'll put up with it."
: 'As it is now, we look at it and say "oh, jesus".'
: — Jimbo Wales <ref name="jimbo1">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00353.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-18 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-29]] |publisher= or-talk Tor mailing list |quote= }}</ref>
: 'Hey folks -- the reason that Wikipedia (and other services) use IPs to block users is not stupidity, laziness, or ignorance. People use IP-based blocking because it limits abuse better than no blocking at all. Blocking IPs is not saying, "I hate privacy, I think IPs do and should map 1:1 to human beings, and abuse is an ISP problem; and Tor doesn't exist." It's saying, "I can't deal with the abuse I'd see if I didn't block some IPs, and while IP blocking is imperfect, it's about as good as any other scheme I have had the time so far to implement."' — Nick Mathewson, Tor developer <ref name="nickm1">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00260.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-18 |last= Mathewson |first= Nick |date= [[2005-09-27]] |publisher= or-talk Tor mailing list |quote= }}</ref>
== The mixed nature of Tor, and disadvantages of blocking it completely ==
''Both the well-meaning and the ill-intentioned use Tor - blocking Tor completely prevents not only bad but also good.''
Both helpful and harmful people, as well as some people who are not clearly helpful or harmful, use Tor. In fact, most of the ''really bad'' people don't use Tor. Tor does not provide the best anonymity, but unlike many of the methods criminals use, it does provide legal and ethical anonymity, since the Tor nodes are all run by volunteers. However, Tor still gets more than its share of (mostly) law-abiding trouble-makers.<ref name="faq-abuse">{{cite web |url= http://tor.eff.org/faq-abuse.html.en#WhatAboutCriminals |title= Abuse FAQ for Tor Server Operators: Doesn't Tor enable criminals to do bad things? |accessdate= 2007-06-18 |date= [[2007-06-17]] |publisher= The Tor Project |quote= }}</ref>
But Tor also has well-meaning users. People who use Tor to bypass censorship (consider the [[Wikipedia:WikiProject Countering systemic bias|systemic bias]] implications),<ref name="entranceblockingresistance"> {{cite web |url= http://cvs.seul.org/viewcvs/viewcvs.cgi/*checkout*/tor/trunk/doc/design-pape... |title= Design of a blocking-resistant anonymity system (Draft - revision 10168) |author = Roger Dingledine and Nick Mathewson |accessdate= 2007-06-18 |date= [[2007-05-12]] |publisher= The Tor Project |quote= }} </ref> people who use Tor to get around restrictive firewalls,<ref name="firewalledclient"> {{cite web |url= http://wiki.noreply.org/noreply/TheOnionRouter/TorFAQ#FirewalledClient |title= My firewall only allows a few outgoing ports |work= TheOnionRouter/TorFAQ |accessdate= 2007-06-18 |date= [[2007-06-11]] |publisher= Noreply Wiki |quote= }}</ref> and other random and well-meaning people who want or need privacy.<ref name="toroverview">{{cite web |url= http://tor.eff.org/overview.html.en |title= Tor: Overview |accessdate= 2007-06-18 |date= [[2006-08-22]] |publisher= The Tor Project |quote= }}</ref>
Ideally, we should like to separate the well-meaning users from the bad.
Some more quotes:
: 'Let me tell you what I love. I love the Chinese dissident who wants to work on Wikipedia articles in safety. I love that Wikipedia is an open platform that allows people to have that voice, and that we can have a positive impact on the world in large part because we don't bow to censorship and we are willing to reach out and work with people like Tor to empower individuals to speak, no matter what sort of oppressive conditions they face.
: 'WE ARE ON THE SAME SIDE.'
: — Jimbo Wales <ref name="jimbo2">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00253.html |title= Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-18 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-27]] |publisher= or-talk Tor mailing list |quote= }}</ref>
: 'I want there to be a great method for people to be able to edit Wikipedia safely and securely no matter what their personal situation may be, and I want that method to be sufficiently abuse-free that we can allow it.' — Jimbo Wales <ref name="jimbo3">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00315.html |title= Re: [roy@rant-central.com: Re: [arma@mit.edu: Re: Wikipedia & Tor]] |accessdate= 2007-06-18 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-28]] |publisher= or-talk Tor mailing list |quote= }}</ref>
== Privacy with minimal damage to Wikipaedia? ==
Nick Mathewson, a Tor developer, summarised many of the key points quite nicely during a discussion between Jimbo and the good people of Tor in 2005:
: 'To everybody in this discussion: here are some things that might make you feel better in the short run, but which will ultimately not help.
:: - Trying to convince Jimbo that privacy is good, or that reducing false positives would serve wikipedia's goal of openness. He knows.
:: - Trying to convince Tor developers that abuse is bad, or that reducing abuse would server Tor's goal of widespread acceptance. We know.
:: - Trying to convince Tor developers to subvert services attempts to block Tor exit connections{4} based on IPs. We won't; it would be wrong.
:: - Trying to convince Wikipedia operators that privacy is evil _per se_, and should be thwarted regardless of potential for abuse in particular instances. I doubt they'd buy it; it doesn't look like Jimbo will.
:: - Trying to convince the world that some wikipedians have an unnuanced view of Tor and anonymity and false-positives. We know.
:: - Trying to convince the world that some Tor operators and users have an unnuanced view of IP blocking and abuse. We know.
: 'Here are some things that would be harder, but which would probably be useful:
:: - Try to develop better understanding of why and whether abuse prevention mechanism, even the ones you think are crappy, work in practice.
:: - If a hypothetical abuse prevention mechanism wouldn't work, explain why not.
:: - When it looks like somebody is saying something utterly stupid or insane, try to figure out why, from their point of view, it might seem reasonable to say such a thing.{5}
:: - Come up with workable ways to prevent abuse that don't damage privacy or preclude anonymizing layers like Tor.
:: - Implement those models, and try them out.
: 'There are probably other helpful things, too.'
: — Nick Mathewson <ref name="nickm1" />
== Internet Protocol (IP) addresses do not map 1:1 to human beings, or even computers ==
''IP addresses are not people - they are changed and shared in various ways, even without open anonymising proxies.''
Consider the following situations:
* Shared computers - [[public computer]]s such as those in libraries and internet cafes, as well as more private computers shared by people who are living together, such as family members. Note that many of these may be behind a NAT (see later) anyway.
* [[Dynamic Host Configuration Protocol|Dynamic IP addresses]] - IPv4 addresses are limited in quantity, and IPv6 is not catching on. So, rather than needing an IP address for each separate client (where a client may be a computer or a NAT, as explained later), an Internet Service Provider (ISP) only needs enough IP addresses for the number of clients that are online at one time. The automated nature of Dynamic Host Configuration Protocol (DHCP) also makes it convenient. Note that there are different degrees of dynamicness - an IP address is assigned for a limited but adjustable period of time, and a DHCP server may or may not try to give a client the same IP address as it had before (note that there are ways of getting the DHCP server to give you a new IP address).<ref name="sun">{{cite web |url= http://www.sun.com/software/whitepapers/wp-dhcp/dhcp-wp.pdf |title= Dynamic Host Configuration Protocol: Technical White Paper |accessdate= 2007-06-18 |year= 2000 |month= August |format= [[PDF]] |publisher= Sun Microsystems, Inc |quote= }} Also see Google's [http://216.239.51.104/search?q=cache:G4oLBWItHb8J:www.sun.com/software/white... HTML cache].</ref>
* Overloading or overlapping [[network address translation]] (NAT) - This is another solution to the IPv4 address scarcity problem. Basically, multiple clients share the same IP address, with a router converting packets sent to that IP address to packets sent to various local IP addresses, and vice versa. This can be done at multiple levels - a household, a school, library, or business, or even an entire ISP. (Consider the collateral damage that could be caused by blocking a Tor exit node which is running behind a large NAT.<ref>{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00257.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-18 |last= Syverson |first= Paul |date= [[2005-09-27]] |publisher= or-talk Tor mailing list |quote= This is potentially a bigger problem than it may appear. On the one hand, services should be allowed to refuse connections from sources of possible abuse. But when a Tor node administrator decides whether he prefers to be able to post to Wikipedia from his IP address, or to allow people to read Wikipedia anonymously through his Tor node, he is making the decision for others as well. (For a while, Wikipedia blocked all posting from all Tor nodes based on IP addresses.) If the Tor node shares an address with a campus or corporate NAT, then the decision can prevent the entire population from posting. This is a loss for both Tor and Wikipedia: we don't want to compete for (or divvy up) the NAT-protected entities of the world. }}</ref>)<ref name="cisco">{{cite web |url= http://www.cisco.com/en/US/tech/tk648/tk361/technologies_tech_note09186a0080... |title= How NAT Works |accessdate= 2007-06-18 |date= [[2006-01-24]] |work= IP Addressing Services |publisher= [[Cisco]] |quote= }}</ref>
* [[AOL]] - Think of it like an overloading or overlapping NAT where the public IP addresses keep rotating (dynamic NAT).<ref name="aol">{{cite web |url= http://webmaster.info.aol.com/proxyinfo.html |title= AOL Proxy Info |accessdate= 2007-06-18 |date= [[2005-06-14]] |publisher= America Online, Inc |quote= }}</ref> A good solution to the AOL problem is X-Forwarded-For, but this only works because AOL is not actually trying to protect the privacy of its users. See Meta's [[m:XFF project|XFF project]].
* Things change. People switch ISPs.<ref name="eweek">{{cite web |url= http://www.eweek.com/article2/0,1759,290497,00.asp |title= Need For Speed Drives Customer Churn |accessdate= 2007-06-20 |last= Wetzel |first= Rebecca |date= [[2001-09-07]] |publisher= eWeek |quote= About 24 percent of respondents say they switched ISPs in the past year, compared to about 23 percent in 2000 and 20 percent the year before. About 28 percent say they are extremely or somewhat likely to switch next year. }}</ref> People move.
Unless you are paying for a server-quality internet connection, chances are your IP address is dynamic, you are behind a NAT, or quite possibly both.<ref name="cisco" />
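To make the overloading NAT case above concrete, here is a minimal sketch of a port-translation table. It is a toy model, not how any particular router works: the class name, the starting port and the addresses (taken from the documentation ranges) are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class OverloadingNat:
    """Toy port-address-translation table: many private hosts, one public IP."""

    public_ip: str
    next_port: int = 40000  # hypothetical start of the translated port range
    table: dict = field(default_factory=dict)

    def translate(self, private_ip: str, private_port: int) -> tuple:
        """Return the (public_ip, public_port) pair the outside world sees."""
        key = (private_ip, private_port)
        if key not in self.table:
            # Each new private endpoint gets its own public port,
            # but every endpoint shares the same public address.
            self.table[key] = self.next_port
            self.next_port += 1
        return (self.public_ip, self.table[key])


nat = OverloadingNat(public_ip="203.0.113.7")  # documentation-range address
seen_by_wiki = {nat.translate(f"192.168.0.{host}", 51000)[0] for host in range(2, 6)}
```

Four different machines sit behind the NAT, yet `seen_by_wiki` contains a single address, which is all an IP-based block has to work with.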
Note that IP addresses [[Internet_privacy#ISPs|can generally be traced back to a particular computer or home/office router, and ISP bill]], but this tends to require either the cooperation of the ISP, or cracking into the ISP, which are not things Wikipaedia does.<ref name="adrian">[[User talk:Adrian|Ask]] [[Adrian Lamo]].</ref>
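The X-Forwarded-For approach mentioned for AOL above can be sketched roughly as follows. This is not MediaWiki's code; the function name and the trusted proxy range are hypothetical, and the sketch assumes a single cooperating proxy that appends the address it saw to the header. The key point is that the header is only believed when the connection itself comes from a proxy we already trust.

```python
import ipaddress
from typing import Optional

# Hypothetical range for a cooperating proxy operator (documentation addresses).
TRUSTED_PROXY_NETS = [ipaddress.ip_network("198.51.100.0/24")]


def effective_client_ip(peer_ip: str, xff_header: Optional[str]) -> str:
    """Pick the address to attribute an edit to.

    The X-Forwarded-For header is honoured only when the connecting peer
    is a trusted proxy; otherwise anyone could forge it.
    """
    peer = ipaddress.ip_address(peer_ip)
    if not xff_header or not any(peer in net for net in TRUSTED_PROXY_NETS):
        return peer_ip
    # The last entry is the one appended by the trusted proxy itself.
    candidate = xff_header.split(",")[-1].strip()
    try:
        return str(ipaddress.ip_address(candidate))
    except ValueError:
        # Malformed header: fall back to the address we can actually see.
        return peer_ip
```

An anonymising proxy would, by design, never send such a header, which is why this only helps with proxies that are not trying to protect privacy.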
== IP-based blocking, and why it works better than nothing ==
''While IP-based blocking is far from perfect, and could not actually prevent someone from editing, even without open anonymising proxies, it is economic - at least you force the person to find another IP address or range before he or she can harm Wikipaedia again.''
As we have established [[#Internet_Protocol_.28IP.29_addresses_do_not_map_1:1_to_human_beings.2C_or_even_computers |above]], IPs do not correlate exactly to individual people. However, ISP-issued IPs tend to be a costly resource. Yes, AOL IPs rotate frequently, but you actually have to purchase an AOL connection. You might be in a NATed library, but using that IP requires your physical presence in the library. Your entire ISP may share one IP address, but again, you have to purchase a connection from that ISP. If you have a dynamic IP address not shared by that many people, perhaps just the people you live with, which tends to stay the same upon renewal, IP-based blocking works even better. However, there will always be collateral damage, and there will always be ways to bypass such blocks.
As Nick Mathewson said, 'People don't block IPs because they think IPs are people, or because they've never heard of NAT. They block IPs because IPv4 addresses are (for most people, at the moment, to a first approximation) a somewhat costly{1} resource. When they block Bob's IP, the theory is that they force him spend the effort to move to a new IP before he can abuse their service again.{2}'<ref name="nickm1" />
Ultimately, the goal is to wear the undesired user out, and get them to exercise their right to leave.<ref>{{cite web |url= http://www.usemod.com/cgi-bin/mb.pl?CommunityExile |title= CommunityExile |accessdate= 2007-06-18 |publisher= Meatball Wiki |quote= }}</ref>
The problem with open proxies is that they are not a costly enough resource - anyone in the world with an internet connection can use Tor.
== Destructive-editing-resistant Tor unblocking ==
'''Editor's note: Haven't actually finished what follows. It's rather outliney, sorry.'''
=== Softblock Tor IPs ===
''Softblocking, as presently implemented, does not work because it is too easy to create accounts.''
In theory, by restricting Tor users to editing from an account rather than simply editing directly as Tor IP, we can then block the destructive ones individually. If account creation is uniformly disabled for open proxies, then to get an account they would have to go through the effort of emailing unblock-en-l.<ref name="jimbosoftblock"><!-- This message just shows Jimbo considering the idea of softblocking, not the technical theory I gave. -->{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00297.html |title= Re: Wikipedia & Tor |accessdate= 2007-06-18 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-27]] |publisher= or-talk Tor mailing list |quote= Putting Tor users into a "soft block" mode is a reasonable thing to do, but I'll have to think about how we might want to do it. }}</ref> In practice, account creation is too cheap. This didn't work.<ref name="thatcher131">{{cite web |url= {{fullurl:Wikipedia_talk:Blocking_policy|diff=next&oldid=129596558}} |title= Wikipedia talk:Blocking policy (diff) |accessdate= 2006-06-21 |author= Thatcher131 |authorlink= User:Thatcher131 |date= 19:03, [[2007-05-09]] |publisher= Wikipaedia, The Free Encyclopaedia |quote= TOR proxies were always blocked. When softblocking became possible, it was tried as an experiment. The experiment failed. The checkusers have repeatedly discovered serious vandalism coming from softblocked anonymous proxies (not just TOR). }}</ref><ref name="jayjg">{{cite web |url= {{fullurl:User_talk:Jayjg|diff=129415904&oldid=129415589}} |title= User talk:Jayjg (diff) |accessdate= 2006-06-21 |author= Jayjg |authorlink= User:Jayjg |date= 02:33, [[2007-05-09]] |publisher= Wikipaedia, The Free Encyclopaedia |quote= I'm not talking about Main page vandalism, or password cracking, which are recent and ephemeral problems. I'm talking about run of the mill nastiness. There's lots of it out there, and people are getting away with it using TOR proxies. That has to stop. 
}}</ref> Also see [[Wikipedia:Blocking policy proposal]].
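The difference between the block options discussed in this section can be sketched as a small decision function. This is only an illustration of the logic described above, not MediaWiki's implementation; the names `BlockKind`, `IpBlock`, `may_edit` and `may_create_account` are made up for the example.

```python
from dataclasses import dataclass
from enum import Enum


class BlockKind(Enum):
    NONE = "none"
    SOFT = "soft"  # anonymous edits refused, logged-in accounts allowed
    HARD = "hard"  # all edits from the address refused


@dataclass(frozen=True)
class IpBlock:
    kind: BlockKind
    account_creation_disabled: bool = False


def may_edit(block: IpBlock, logged_in: bool) -> bool:
    """Can an edit from an address carrying this block go through?"""
    if block.kind is BlockKind.NONE:
        return True
    if block.kind is BlockKind.SOFT:
        return logged_in
    return False


def may_create_account(block: IpBlock) -> bool:
    """Can a new account be registered from an address carrying this block?"""
    return block.kind is BlockKind.NONE or not block.account_creation_disabled


# The failure mode described above: a softblock that still allows sign-ups.
soft_open = IpBlock(BlockKind.SOFT, account_creation_disabled=False)
vandal_gets_through = may_create_account(soft_open) and may_edit(soft_open, logged_in=True)
```

With account creation left open, `vandal_gets_through` is true: the softblock costs the vandal one sign-up form and nothing more.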
=== Masking Tor exit nodes ===
''Collecting Tor users behind a single IP address or mask would make it easier to play temporary blocking games, and to check up on edits coming through Tor.''
It is rather hard to change the blocking options for Tor exit nodes all at once, since the IPs are all over the place. If Tor exit nodes were hardblocked but a separate service, e.g. a hidden service, were set up which Tor users could connect to, that separate service could use its own IP, giving Wikipaedia one IP to use to change the blocking options for Tor users. This would allow Wikipaedia to play games with temporary blocks during periods of widespread Tor misuse, and give Checkusers a single IP to check. It would also help give well-meaning Tor users the opportunity to establish a reputation. Freenode already does this, using two hidden services - one which any Tor user may connect to but is blocked during periods of general Tor misuse, and one which requires e-mailing the freenode staff with their public key, etc.<ref name="freenode">{{cite web |url= http://freenode.net/irc_servers.shtml#tor |title= Accessing Freenode Via Tor |accessdate= 2007-06-21 |year= (c) 2002-2007 |work= IRC Servers |publisher= Peer-Directed Projects Center |quote= }}</ref>
[[#An_old_patch |A patch]] written by Adam Langley, but never committed, would accomplish something similar.
Note that this might not do much in terms of sockpuppetry prevention/detection (discussed later).
=== Tor pseudonymity system ===
''Building an authentication system into Tor would break Tor's security.''
It has been suggested that some form of authentication be implemented in Tor, in order to make it more usable. However, Tor is not a pseudonymity network. Tor is an anonymity network. Pseudonymity would basically be the equivalent of adding a backdoor (i.e. intentional vulnerability) to Tor.
: 'Actually, we're not in any better position than you are. We don't know who our userbase is either; we certainly don't have identities for them, and we really don't want to track their identities or trustworthiness, for a number of reasons:
:: - If it were easy for us to tell what individual users were doing with Tor, it would be easy for *everybody* tell what individual users were doing. I wish we could separate good users from bad without seeing what they were doing, but without linking them to the actual contents of their communication, it isn't really possible.
:: - We don't want people to have to trust us with their secrets. It would make us a great target for malicious hackers and legal attacks.
:: - Our standard of trust is not likely to be anyone else's.
:: - We are not a community service; our operators don't know our users, and our users don't know each other, except when they choose to communicate on forums like this one. This is necessary for privacy: if the community knows who's who on the network, so does the Chinese government.'
: — Nick Mathewson <ref name="nickm2">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00299.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-20 |last= Mathewson |first= Nick |date= [[2005-09-27]] |publisher= or-talk Tor mailing list |quote= }}</ref>
=== Extra-Tor pseudonymity system ===
''An authentication layer on top of Tor could be voluntary, but basing it on IPs would leave many unsolved problems.''
The pseudonym system could be external to Tor, making it optional for those who would prefer to be completely anonymous. This might be a separate authentication server, or Wikipaedia's own authentication system.
The good people of Tor [[User:Lunkwill/nym|pro]][[User talk:Lunkwill/nym|posed]] a piece of software called 'Nym' for this. Nym basically provides a thin wrapper around an IP address not believed to be a Tor exit node or other blocked IP address, so it would not be helpful to users who are truly interested in privacy or who run exit nodes. It would, however, be helpful to users who use Tor for reasons other than privacy, as well as non-Tor users who spend much but not all of their time behind some shared, frequently blocked IP address, such as a school.<ref name="nym-paper">{{cite web |url= http://lunkwill.org/cv/nym.pdf |title= nym: practical pseudonymity for anonymous networks |accessdate= 2007-06-28 |author= [http://lunkwill.org/cv/ Holt, Jason E.] |month= October |year= 2005 |quote= }} </ref><ref name="nym-ortalk-wrote">{{cite web |url= http://archives.seul.org/or/talk/Oct-2005/msg00229.html |title= Re: Wikipedia and Tor - a solution in the works? |accessdate= 2007-06-28 |last= Holt |first= Jason |date= [[2005-10-29]] |publisher= or-talk Tor mailing list |quote= }} </ref><ref name="nym-bugzilla">{{cite web |url= http://bugzilla.wikimedia.org/show_bug.cgi?id=3729 |title= Bug 3729 - Patch: SSL client certificate authentication |accessdate= 2007-06-30 |author= Jason, Evaldo Gardenali, Timo Jyrinki |publisher= Wikimedia Bugzilla |quote= }} </ref><ref name="nym-ortalk-vote">{{cite web |url= http://archives.seul.org/or/talk/Dec-2005/msg00003.html |title= Voting for nym |accessdate= 2007-06-28 |last= Holt |first= Jason |date= [[2005-12-02]] |publisher= or-talk Tor mailing list |quote= }} </ref><ref name="nym-client">{{cite web |url= http://lunkwill.org/src/nym/javascript/jsnymclient.html |title= Javascript nym interface |accessdate= 2007-06-28 |last= Holt |first= Jason E. 
|date= [[2006-03-04]] |quote= }} </ref><ref name="nym-source">{{cite web |url= http://lunkwill.org/src/nym/ |title= Nym source repository |accessdate= 2007-06-28 |last= Holt |first= Jason E |quote= }} </ref>
One idea for a separate authentication system involved nymbles.<ref name="nymble">{{cite web |url= http://www.petworkshop.org/2007/papers/PET2007_preproc_Nymble.pdf |title= Nymble: Anonymous IP-Address Blocking |accessdate= 2007-06-23 |author= Peter C. Johnson, Apu Kapadia, Patrick P. Tsang, and Sean W. Smith |year= 2007 |publisher= [http://www.petworkshop.org/2007/program.php 7th Workshop on Privacy Enhancing Technologies], Ottawa, Canada |quote= }} </ref> Some flaws:
* While it does provide more layers of protection than Nym, it is still significantly less secure than Tor - the users with the greatest perceived need for privacy probably won't use it.
* Since it still relies on IP addresses as a scarce resource, it would not help Tor exit node operators.
* Less entrance blocking resistance than Tor - the Chinese ISPs might actually bother to block it.
A pseudonymity system which uses some scarce resource besides IP addresses would better protect the privacy of Tor users.
=== Increasing the cost of nym creation ===
''Non-proxy IP addresses work economically because they are scarce, but there are other scarce resources; however, human resources work better than computer resources.''
<!-- Editor's note: There was significant discussion in the or-talk mailing list archives on this, which I should really look up. -->Basically, there are a number of resources of varying degrees of scarcity, not only IP addresses but also e-mail addresses (ISP-issued ones are more expensive than free ones), puzzle-solving, etc. Proof-of-work where a computer could do the work doesn't work.<ref name="proofofworkprovesnottowork">{{cite web |url= http://www.cl.cam.ac.uk/~rnc1/proofwork.pdf |title= "Proof-of-Work" Proves Not to Work |accessdate= 2007-06-28 |author= Ben Laurie and Richard Clayton |publisher= University of Cambridge |quote= }} </ref> Hence, how can we raise the human-effort cost of nym creation? (Note that traditional, non-proxy IPs are theoretically somewhat costly in human effort to change, which is why [[#IP-based_blocking.2C_and_why_it_works_better_than_nothing |IP-based blocking works better than nothing]].) The goal is to make it difficult enough to authenticate as a good user via Tor, that most of the bad users will simply switch to not-good open proxies rather than go through the whole thing in exchange for a few minutes of vandalism, while good users will only need to go through the process once (assuming they don't [[#When_pseudonymity_isn.27t_anonymous_enough |find such a pseudonym to compromise their anonymity too much]]).
As Jimbo said:
: 'For now the key thing to do is to shift the incentives on the bad users so that Tor is less desirable for them than playing with the broken proxies or just doing whatever with a dialup account or aol addresses or whatever.'<ref name="jimbo-incentives"> {{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00332.html |title= Re: Wikipedia & Tor |accessdate= 2007-06-28 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-28]] |publisher= or-talk Tor mailing list |quote= }}</ref>
An excerpt of Jimbo's comments from later in the conversation:
: 'So the _degree_ of trust we need is actually quite small. It isn't "We certify this person to be a certain user, guaranteed, the same as ever". It's just "this packet is being sent to you from a source that has somehow tended generally to lead us to believe to some small extent that the person posting it has not been a jackass, by and large".
: 'Or, as has been brilliantly discussed here already, it could be "this packet has been sent to you via a mechanism that one might bother to use, were one a dissident really needing anonymity, but sufficiently bothersome that were one simply a lunatic on crack, one would more likely have simply switched to using anonymous proxies".
: 'It won't be perfect, but as an empirical matter, it's probably good enough.'<ref name="jimbobothersome">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00356.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-28 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-29]] |publisher= or-talk Tor mailing list |quote= }}</ref>
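For readers unfamiliar with the computer-side puzzles mentioned above, here is a minimal hashcash-style sketch. It is not taken from the cited paper or from any proposal on or-talk; the function names, the challenge string and the difficulty are assumptions chosen for illustration. It shows the kind of puzzle whose cost is paid in CPU time rather than human effort.

```python
import hashlib
from itertools import count


def _digest_value(challenge: str, nonce: int) -> int:
    """SHA-256 of 'challenge:nonce', read as a big integer."""
    digest = hashlib.sha256(f"{challenge}:{nonce}".encode()).digest()
    return int.from_bytes(digest, "big")


def solve(challenge: str, difficulty_bits: int) -> int:
    """Search for a nonce whose hash has `difficulty_bits` leading zero bits."""
    target = 1 << (256 - difficulty_bits)
    for nonce in count():
        if _digest_value(challenge, nonce) < target:
            return nonce


def verify(challenge: str, nonce: int, difficulty_bits: int) -> bool:
    """Checking a solution takes a single hash, however hard it was to find."""
    return _digest_value(challenge, nonce) < (1 << (256 - difficulty_bits))


# Hypothetical challenge a server might hand out with a sign-up form.
nonce = solve("nym-request-example", difficulty_bits=12)
accepted = verify("nym-request-example", nonce, difficulty_bits=12)
```

The solver is a loop that any machine can run unattended, so the puzzle raises the price in electricity, not in the human effort this section is after.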
=== Increasing the human-effort of nym creation ===
''Requiring Tor users to create or improve an article to be unblocked could actually provide a much higher level of security for Wikipaedia than IP-based authentication, although it depends on the effort required of them.''
While we could do some complex puzzle system,<ref name="murdoch">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00340.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-28 |last= Murdoch |first= Steven J. |date= [[2005-09-29]] |publisher= or-talk Tor mailing list |quote= }}</ref> remember that we are an encyclopaedia project, so the most natural puzzle for us would be the improvement of a Wikipaedia article. Blocked users can still edit their talk pages. So, the user first creates an account - we can either hardblock Tor IPs with account creation enabled, or they can e-mail unblock-en-l. Once they have an account, they can edit their talk page. They paste a neglected Wikipaedia article into the talk page and improve it, using {{tl|helpme}} as needed. After going through this effort and authenticating as a good user, they use {{tl|unblock}}. The user may then be granted ipblock-exempt (see [http://bugzilla.wikimedia.org/show_bug.cgi?id=3706 bug report] and [[Wikipedia:Wikipedia Signpost/2007-01-08/Technology report|Signpost]]), meaning the hardblocks won't apply to them. If they do start causing trouble, they can be easily blocked. A similar idea is suggested in {{tl|2ndchance}}.
This should in fact provide greater security than IP-based authentication. Consider the difference between 'assessment signals' and 'conventional signals'. Assessment signals cannot easily be faked, since showing the signal requires possessing the quality. Conventional signals are cheap but can easily be faked, diluting their value. It's the difference between showing strength by having a thick neck and showing it by wearing a t-shirt that says 'Gold's Gym Powerlifter'. The writing of an encyclopaedia article is an assessment signal, since it shows a) willingness to put in effort, consistent with a real concern for privacy and b) skill (or at least interest) in contributing to the encyclopaedia. The possession of a non-proxy IP address is only a conventional signal, and not even one which says much.<ref name="donath-modelsofhonestyanddeception">{{cite web |url = http://smg.media.mit.edu/people/Judith/Identity/IdentityDeception.html#29347 |title = 'Models of honesty and deception'. Identity and deception in the virtual community |accessdate = 2007-07-09 |last = Donath |first = Judith S. |date = [[1996-11-12]] |work = Communities in cyberspace |publisher = Berkeley: University of California Press |pages = 1 |quote = }} Also available on [http://citeseer.ist.psu.edu/donath97identity.html CiteSeer] ([http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://citese... PostScript], [http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://citese... PDF], [http://citeseer.ist.psu.edu/cachedpage/63811/1 Image]) </ref>
Jimbo suggested a similar system:
:'But, we could do something like: allow non-logged in posts, and allowed posts with Tor *for trusted accounts*, but not non-logged-in posts with Tor, and not logged-in-but-not-yet-trusted accounts with Tor.
:'Still, there's a flaw: this means you have to come around to Wikipedia in an non-Tor manner long enough for us to trust you, which pretty much blows the whole point of privacy to start with.' <ref name="jimbotrust">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00292.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-28 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-27]] |publisher= or-talk Tor mailing list |quote= }}</ref>
The above fills in the question of how to become trusted without sacrificing privacy.
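To make the rule in the quote concrete, here is a minimal sketch of it as a single check. The function name and its three boolean parameters are my own illustration, not anything that exists in MediaWiki, and "trusted" is assumed to mean whatever flag (such as ipblock-exempt) the community ends up using.

```python
def may_edit(logged_in, trusted, via_tor):
    """Return True if an edit would be allowed under the quoted rule.

    All three arguments are hypothetical flags used for illustration only.
    """
    if not via_tor:
        # Edits that do not come through Tor are unaffected,
        # whether or not the editor is logged in.
        return True
    # Through Tor, only logged-in accounts that are already trusted may edit.
    return logged_in and trusted
```

The talk page route described above is what would let an account reach the trusted state without ever taking the non-Tor branch.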
Problem: ipblock-exempt is only available to admins right now.<ref name="bug3706"> http://bugzilla.wikimedia.org/show_bug.cgi?id=3706 </ref><ref name="mediawikirev18904">http://svn.wikimedia.org/viewvc/mediawiki?view=rev&revision=18904</ref> The code to add a separate group for ipblock-exempt, and even allow bureaucrats or admins to add people to this group, has been written, but the Wikimedia developers do not want to commit it. Basically, the developers want a more elegant solution that will make the userrights interface more modular in general, rather than a simple hack for this one group.<ref name="bug9862">{{cite web |url= http://bugzilla.wikimedia.org/show_bug.cgi?id=9862 |title= Bug 9862 - Separate group for ipblock-exempt on en.wikipedia |accessdate= 2007-06-21 |author= Armed Blowfish, Rob Church, Martinp23, Simetrical, ^demon |year= 2007 |publisher= Wikimedia Bugzilla |quote= }}</ref><ref name="bug6711">{{cite web |url= http://bugzilla.wikimedia.org/show_bug.cgi?id=6711 |title= Bug 6711 - More modular userrights interface |accessdate= 2007-06-21 |author= Simetrical, Rotem Liss, Titoxd, Max Semenik, et. al |publisher= Wikimedia Bugzilla |quote= }}</ref> While Wikipaedia could theoretically make people they want to unblock admins,<ref name="ipblock-exempt-wikitech">{{cite web |url= http://www.gossamer-threads.com/lists/wiki/wikitech/79496 |title= New ipblock-exempt permission |accessdate= 2007-06-22 |author= Andrew, Jepe, Simetrical and Nospam |year= 2007 |month= January |publisher= Wikitech Mailing List |quote= }}</ref> it is [[Wikipedia:Requests for adminship/Armedblowfish|highly]] [[Wikipedia:Requests for adminship/CharlotteWebb|unlikely]] that any Tor users, with the possible exception of Chinese ones, will be trusted by the community enough to attain adminship. 
Also note that Jimbo has said that ipblock-exempt, if it were available, would not cost money.<ref name="jimbonomoney">{{cite web |url= http://archives.seul.org/or/talk/Sep-2005/msg00387.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-28 |last= Wales |first= Jimmy |authorlink= Jimmy Wales |date= [[2005-09-30]] |publisher= or-talk Tor mailing list |quote= }}</ref>
If community feeling on the matter were different, an interesting criterion for a Tor user or exit node operator, with no contributions outside of his or her talk page, might be the old 'One Featured Article'. See [[User:Miborovsky/1FA]], [[User:Jguk/admin criterion]], and the <span class="plainlinks">[{{fullurl:Special%3ALog|type=delete&user=&page=User%3AMailer+diablo%2FOne+Featured+Article}} deletion log of User:Mailer diablo/One Featured Article].</span> Given the difficulty someone unable to edit outside of their talk page might have writing a featured article, this question is probably academic, but: Given that there is no way to individually unblock a Tor user or exit node operator besides granting adminship, should that Tor user or exit node operator be granted adminship if he or she writes a featured article on his or her talk page? Since the human-effort cost of writing a featured article is far higher than the human effort cost of maintaining two separate IP addresses/ranges, I would say yes.
== Sybil attack prevention and detection ==
''To prevent most Sybil (sockpuppet) attacks, make creating accounts time-consuming; to detect the persistent ones, use writing and behavioural analysis.''
Defenses against [[Sybil attack]]s fall into two categories - prevention and detection. Traditional Sybil attack prevention lies in increasing the cost of nym creation, [[#Increasing_the_cost_of_nym_creation |as described above]]. The higher the [[#Increasing_the_human-effort_of_nym_creation |human-effort]] cost of nym creation, the fewer Sybils an attacker will have time to create. However, as not all people have the same amount of time, this does have limitations. According to Judith S. Donath of MIT Media Lab, 'One can have, some claim, as many electronic personas as one has time and energy to create.'<ref name="donath1">{{cite web |url = http://smg.media.mit.edu/people/Judith/Identity/IdentityDeception.html |title = Identity and deception in the virtual community |accessdate = 2007-07-09 |last = Donath |first = Judith S. |date = [[1996-11-12]] |work = Communities in cyberspace |publisher = Berkeley: University of California Press |pages = 1 |quote = }} Also available on [http://citeseer.ist.psu.edu/donath97identity.html CiteSeer] ([http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://citese... PostScript], [http://citeseer.ist.psu.edu/rd/0%2C63811%2C1%2C0.25%2CDownload/http://citese... PDF], [http://citeseer.ist.psu.edu/cachedpage/63811/1 Image]) </ref>
One could attempt to verify that each online persona represents a different actual person by demanding legal ID. However, this would exclude about one third of the world's population - over two billion people - according to UNICEF. And not an evenly distributed one third, either, but one mostly living in areas which are already underrepresented on Wikipaedia, hence increasing the [[Wikipedia:WikiProject Countering systemic bias |systemic bias]] problem.<ref name="UNICEF1">{{cite news | title = Birth registration: The 'first' right | url = http://www.unicef.org/pon98/civil1.htm | work = The Progress of Nations 1998 | publisher = [[UNICEF]] | date = | accessdate = 2007-07-09 | quote = Every year, about 40 million babies -- one third of all births -- go unregistered around the world. }}</ref><ref name="UNICEF2">{{cite news | title = Millions are 'missing' | url = http://www.unicef.org/pon98/civil2.htm | work = The Progress of Nations 1998 | publisher = [[UNICEF]] | date = | accessdate = 2007-07-09 | quote = The obstacles to registration are often banal, the product of misplaced priorities and bureaucratic inadequacies. Poor and rural countries tend to have lower registration rates, struggling as they must to cope with the inevitable shortages of trained personnel and modern technology, the logistical problems of travelling to registry offices and ignorance or fear of the process. As a result, birth registration lags in countries such as Sierra Leone, which has a registration rate of less than 10 per cent; Zimbabwe, with around one third registered; and Bolivia, where about half the people have a birth certificate. }}</ref> It would discourage many others, defeating the point of privacy far more than an IP address. Not to mention [[identity document forgery]]. Besides, the fact that someone has a legal name does not tell us anything about who they really are.
: ' 'Tis but thy name that is my enemy; : Thou art thyself, though not a Montague. : What's Montague? it is nor hand, nor foot, : Nor arm, nor face, nor any other part : Belonging to a man. O, be some other name! : What's in a name? that which we call a rose : By any other name would smell as sweet; : So Romeo would, were he not Romeo call'd, : Retain that dear perfection which he owes : Without that title. Romeo, doff thy name, : And for that name which is no part of thee : Take all myself.' : — William Shakespeare <ref name="shakespeare">{{cite book | last = Shakespeare | first = William | authorlink = William Shakespeare | title = Romeo and Juliet | origyear = Around 1597 | url = http://shakespeare.mit.edu/romeo_juliet/ | accessdate = 2007-07-09 | location = England | chapter = Act II, Scene II | chapterurl = http://shakespeare.mit.edu/romeo_juliet/romeo_juliet.2.2.html | quote = | ref = }}</ref>
'That which we call a rose by any other name would smell as sweet'. Even without connecting online personas to real entities, online personas still possess recognisable qualities which alert us to Sybils. Hence, writing and behavioural analysis. Wikipaedia already does this, as explained in the [[#Current_Sybil_attack_detection_practice_on_Wikipaedia |next section]]. <!-- http://archives.seul.org/or/talk/Jan-2005/msg00115.html -->
<span id="Current_Sybil_attack_detection_practice"></span><span id="Current_misused_sockpuppet_detection_practice"></span>
== Current Sybil attack detection practice on Wikipaedia ==
''IPs alone have never been enough to detect Sybils; Wikipaedia has always relied on writing and behavioural analysis.''
Many users can be blocked without knowing whether they are Sybils, so long as we know they are harming Wikipaedia. IP addresses don't correspond one-to-one to people, but Wikipaedia doesn't need to ask every pair of editors who live together, use the same public facilities, share the same ISP, or live in the same geographic region to prove that they are, in fact, separate people. Wikipaedia only [[WP:RFCU |performs checkuser]] after there is suspicion - for example, when two editors sound like the same person. Many times, Wikipaedia can [[WP:SSP |reach a conclusion]] on this without ever asking a checkuser.
<span id="Figuring_out_whether_or_not_Alice_and_Alex_are_the_same_person.2C_without_knowing_who_in_the_world_Alice_and_Alex_are"></span>
== Figuring out whether pseudonymous identities are Sybils, without knowing who in the world they are ==
''Writing and behavioural analysis based Sybil detection still works on Tor users, and should be enough most of the time, especially if you also make it more time-consuming to create accounts.''
As noted above, you all already do this much of the time. A user known to be using Tor may reasonably expect Wikipaedia to do the best [[WP:SSP|non-IP Sybil investigation]] that is possible before worrying about IP evidence, out of respect for the user's privacy. Writing analysis can sometimes be inconclusive, but until the non-IP Sybil investigation has actually turned out inconclusive, we don't need to worry about the lack of available IP evidence. Tor-using editors who are never suspected, or who can be determined to be or not be Sybils based on writing analysis, need never be asked to reveal their IP address. Note that if you increase the cost of nym creation through Tor, Tor will not be particularly attractive to Sybil attackers, and the human-effort cost required to conduct subtle misuse of Sybils is greater than the human-effort cost required to change IPs.
=== When that fails... ===
''On the rare occasion that writing and behavioural analysis is not enough to determine if a Tor user is a Sybil, it is reasonable to ask the Tor user for their IP then, but not before.''
If we really cannot figure out if an editor is misusing Sybils without the aid of IP evidence (which is probabilistic anyway), it is reasonable to grant that user a choice:
* Remain pseudonymous, but consent to allow Wikipaedia to assume that the editor is a Sybil. (What we have been trying to avoid, but a necessary fail-safe for the protection of the editor's pseudonymity and Wikipaedia's integrity.)
* Voluntarily share their IP address with a checkuser. Wikipaedia can try to do a number of things to make this option more palatable:
** By not asking until this point, Wikipaedia demonstrates respect for the editor's privacy, which may make the editor more trusting.
** By not asking until this point, the editor may have a chance to establish a trusting relationship with a checkuser prior to being asked to reveal their IP.
** The editor may ask that their IP only be revealed to one checkuser, and that the checkuser not share the information with other checkusers.<ref name="edda">{{cite book | others = Translated by Olive Bray | title = [[Poetic Edda]] | origdate = No earlier than about [[985]] | accessdate = 2007-06-19 | language = English | chapter = Hávamál, Wisdom for Wanderers and Counsel to Guests | chapterurl = http://www.pitt.edu/~dash/havamal.html#wanderers | quote = Each man who is wise and would wise be called <br /> must ask and answer aright. <br /> Let one know thy secret, but never a second, -- <br /> if three a thousand shall know. }}</ref> This is not a problem; we trust our checkusers to make these determinations by themselves.
** The editor may and probably should ask that the checkuser provide a [[GNU Privacy Guard|GPG]] public key, preferably one which has been verified to some extent to belong to the checkuser, such that when the editor sends the checkuser their IP address, no one who intercepts the message will be able to read it. (Unencrypted communication is the equivalent of a postcard: anyone who can intercept it can read it, and no one will ever know the message was intercepted.)
** The editor may ask that the checkuser simply make a determination on the probability that the editor is a Sybil, without providing any reasoning, e.g. no disclosure of the editor's geographic location. It's okay, we trust checkusers to make this determination.
In return for heightened IP security, Wikipaedia may reasonably ask the editor for whatever confirmation is possible that the IP address is really theirs. Some possibilities:
* A screenshot of a website, like [http://www.ip-adress.com/ ip-address.com], that shows the editor's external IP address. (Advantages: Very easy. Weaknesses: Image manipulation, not helpful for Tor exit node operators.)
* An e-mail sent from an ISP e-mail account or directly from one's computer. (Advantages: Still easy, a bit more secure than a screenshot, is useful for Tor exit node operators. Disadvantages: [[SMTP]] is not secure.)
* Give the checkuser a temporary [[OpenSSH]] account on the computer. (Advantages: Securely proves that the editor has high enough access on the computer to create accounts, making this a fairly sure method, even if the editor is running a Tor exit node. Disadvantages: Harder than other methods, requires a *nix operating system, editor may be uncomfortable granting an account to someone else, or opening their SSH port. Also not helpful to people using public computers.)
* It may also be helpful for an editor with a dynamic IP address to give the checkuser a domain or subdomain name linked to that dynamic IP address, if they have one. [http://dyndns.com DynDNS] provides such subdomains for free.
== Making sure that Alice is indeed Alice ==
''Tor users are a bit more vulnerable to having their passwords sniffed over unencrypted login; solutions involve encryption.''
If you log in via [http://en.wikipedia.org/wiki/Special:Userlogin en.wikipedia.org/wiki/Special:Userlogin], you are sending your user name and password over an unencrypted connection, which is the equivalent of a post card - anyone who can intercept it can read it, no trouble. Tor users may be more vulnerable since Tor only encrypts up to the exit node, and they use so many circuits, with various different exit nodes<!-- Everything from Alice to her exit node is encrypted (at various levels); it is not apparent how exit node to Wikipedia would be likely to entail more hops. Saying that Tor is likely to be monitored either by hostile parties running Tor nodes or outside forces is more likely true, but the governments are spying on everyone anyways... (Carnivore and EarthLink; EFF, NSA, AT&T; etc.) For non-government entities I would be much more worried about people using wireless at Wikimania than someone stealing your password via Tor. — The preceding comment added by Kotepho.
The problem is, Alice keeps switching exit nodes, every 30 seconds to half hour, usually about every 5 minutes. So if she uses 100 different exit nodes, that's 100 chances that the exit node her data goes through might be logging... not to mention each exit node has a different path between it and the Wikimedia servers, increasing the number of data paths, each of which might have someone sniffing packets. — Armed Blowfish
You are only going to autheticate with your password once per session; all other authetication is done with the edit token. They would be able to do most actions with the account—everything except setting a new password at a cursory glance as that requires the old password, but they could set a new email(why not require the password to set a new email too?) and have a new password emailed to them, and then log in with the temporary password and set a new one and take over the account (reading the source in viewvc is non-optimal, but I do not see the cases were the edittoken is invalidated other than confirming your email and changing your password, not even on logout?)—but they do not get a chance at /your/ password more than once per log in attempt. There is also the possibility of active rather than passive attacks such as faking a log out or actually logging you out to get your password. It does indeed present some problems with making sure Alice is Alice for a specific action ([[non-repudiation]], essentially), but not after the fact (at least once there is some way for someone to revoke your edittoken and no way to change the password without the password, c.f. the compromised accounts recently proving they were the rightful owners so they could get rights back). [Sorry if I'm being too pedantic.] — Kotepho
Right, but each login a Tor user may be using a different exit node... in Sweden, then the United States, then Russia, then the United Kingdom, then Canada... and on, different exit nodes from all around the world at each login. (Well, there are some repeats, but still.) — Armed Blowfish -->, allowing for more possible points of interception. (If Alice logs in over 100 different exit nodes, that is at least 100 different paths from exit node to the Wikimedia Foundation that might be [[packet sniffing|sniffed]].) [https://secure.wikimedia.org/wikipedia/en/wiki/Special:Userlogin secure.wikimedia.org] is encrypted, but too many users using it could overload the server, not to mention that [[Transport Layer Security|TLS]] is far from perfect. A hidden service would be fantastic, but this may not be cost-effective for the Wikimedia Foundation.<ref name="dipierro1">{{cite web |url= http://archives.seul.org/or/talk/Oct-2005/msg00228.html |title= Wikipedia and Tor - a solution in the works? |accessdate= 2007-06-21 |last= DiPierro |first= Anthony |date= [[2005-10-29]] |publisher= or-talk Tor mailing list |quote= }}</ref><ref name="tiwaz">{{cite web |url= http://archives.seul.org/or/talk/Oct-2005/msg00246.html |title= RE: Wikipedia and Tor - a solution in the works? |accessdate= 2007-06-21 |last= tiwaz |first= loki |date= [[2005-10-31]] |publisher= or-talk Tor mailing list |quote= }}</ref>
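To put a rough number on the "100 different paths" point, here is a small back-of-the-envelope sketch. It assumes, purely for illustration, that every path has the same independent chance of being sniffed; the per-path probability is a made-up input, not a measurement of the Tor network.

```python
def interception_chance(p_per_path, n_paths):
    """Chance that at least one of n_paths unencrypted logins is sniffed.

    Assumes each path is sniffed independently with probability p_per_path,
    which is a simplification made only for this illustration.
    """
    # Probability that every single path is clean, subtracted from 1.
    return 1.0 - (1.0 - p_per_path) ** n_paths


if __name__ == "__main__":
    # Hypothetical figure: a 1% chance per path, over 100 logins.
    print(interception_chance(0.01, 100))
```

Under those assumptions, even a 1% chance per path adds up to roughly a 63% chance over 100 logins, which is why the number of distinct exit nodes matters more than the risk on any one of them.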
You could ask Tor users to provide a [[GNU Privacy Guard|GPG]] [[Public/private key cryptography|public key]], which would allow them to confirm later that they are the same person who originally provided the key. Note that this would not prevent their accounts from being hijacked; it would only provide a way for them to confirm their identity to another Wikipaedian. This has the benefit of increasing the cost of nym creation - although anyone can generate a GPG public-private key pair, it takes time to do this. Note that Freenode already requires this from Tor users as a condition for using the better of their two hidden services - the one that does not get banned during periods of general Tor misuse.<ref name="freenode">{{cite web |url= http://freenode.net/irc_servers.shtml#tor |title= Accessing Freenode Via Tor |accessdate= 2007-06-21 |year= (c) 2002-2007 |work= IRC Servers |publisher= Peer-Directed Projects Center |quote= }}</ref>
Note that some people may enjoy the plausible deniability granted by a weak authentication system.<ref name="dipierro2">{{cite web |url= http://archives.seul.org/or/talk/Oct-2005/msg00248.html |title= Re: Wikipedia and Tor - a solution in the works? |accessdate= 2007-06-20 |last= DiPierro |first= Anthony |date= [[2005-10-31]] |publisher= or-talk Tor mailing list |quote= The reality is that it's not that huge of a deal if someone finds out my Wikipedia password. In fact, in some ways it's a feature in that I can repudiate any edit made using my account. }}</ref>
== When pseudonymity isn't anonymous enough ==
''Pseudonyms may present a dangerous privacy risk for some Tor users, but there is not much we can do besides invite them to edit Tor IP talk pages.''
Pseudonymity is by nature less anonymous than anonymity. If you can say "this anonymous person is the same as that anonymous person", then if you ever figure out who the anonymous person is, you can hold them accountable for all of their pseudonymous actions. This may be dangerous for some people.<ref name="eward">{{cite web |url= http://archives.seul.org/or/talk/Oct-2005/msg00002.html |title= Re: Hello directly from Jimbo at Wikipedia |accessdate= 2007-06-21 |last= Eward |first= Dustin |date= [[2005-10-01]] |publisher= or-talk Tor mailing list |quote= }}</ref> However, you can't very well authenticate yourself as trusted if you are not willing to establish a pseudonym. Therefore, I am not sure there is much we can do for these people besides invite them to place {{tl|helpme}} requests on various Tor exit node IP talk pages.
== If you all are going to hardblock Tor, you should really do a better job of it ==
''Wikipaedia's current inaccurate Tor blocking methods are bad for both Wikipaedia and Tor, but the thoughtful Tor developers have provided several solutions.''
Tor is a highly dynamic network - the IP addresses of exit nodes change frequently. In addition, exit nodes have individual exit policies. Wikipaedia's current blocking methods result in a number of false positives and false negatives. Exit nodes which do not exit to Wikipaedia are blocked, as are IPs that used to be exit nodes, while other exit nodes are missed. The latter is a problem since any Tor user who can edit a config file can pick which exit nodes they want or don't want to use, making the current hardblock on Tor quite circumventable by a user determined to do so.<ref name="faq-abuse-bans">{{cite web |url= http://tor.eff.org/faq-abuse.html.en#Bans |title= Abuse FAQ for Tor Server Operators: I want to ban the Tor network from my service |accessdate= 2007-06-20 |date= [[2007-06-17]] |publisher= The Tor Project |quote= }}</ref><ref name="config">{{cite web |url= http://cvs.seul.org/viewcvs/viewcvs.cgi/tor/trunk/src/config/torrc.complete.... |title= Sample Tor configuration file: torrc.complete.in (rev 10570) |accessdate= 2007-06-20 |date= [[2007-06-12]] |publisher= The Tor Project |quote= }}</ref>
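The two kinds of error can be stated precisely with a couple of set differences. This is only a sketch: the function name is mine, and the addresses in the usage example are reserved documentation addresses, not real Tor exit nodes or real Wikipaedia blocks.

```python
def compare_blocklist(blocked_ips, actual_exit_ips):
    """Compare a block list against the set of IPs that really exit to the site.

    Returns (false_positives, false_negatives) as two sets:
      false_positives: blocked, but not currently exiting to the site
      false_negatives: exiting to the site, but not blocked
    """
    blocked = set(blocked_ips)
    actual = set(actual_exit_ips)
    return blocked - actual, actual - blocked


if __name__ == "__main__":
    # Hypothetical data for illustration only.
    blocked = ["192.0.2.1", "192.0.2.2", "198.51.100.9"]
    exits = ["192.0.2.2", "203.0.113.5"]
    wrongly_blocked, missed = compare_blocklist(blocked, exits)
    print(sorted(wrongly_blocked))
    print(sorted(missed))
```

Every address in the first set is an ordinary user or former exit node blocked for nothing; every address in the second is an exit node a determined user can select in their config file.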
There is no reason this should be a problem. The Tor developers respect Wikipaedia's self-determination to block exit nodes, and have even written programs to help sites like Wikipaedia do this as accurately as possible. Nick Mathewson explains this:
: 'We're okay with subverting entry-blocks. This isn't hypocrisy; this is because entry blocks are fundamentally different. When Alice connects to Tor to connect to Bob, an exit block means that Bob doesn't want anonymous connections, whereas an entry block means that somebody doesn't want Alice to have privacy. Entry blocking subverts Alice's self-determination, whereas exit blocking on Bob's part *is* self-determination, even if we don't like it.'<ref name="nickm1" />
The Tor developers offer two good ways to do this: [http://exitlist.torproject.org/ TorDNSEL] and the Python [http://cvs.seul.org/viewcvs/viewcvs.cgi/tor/trunk/contrib/exitlist?rev=10402... exitlist]. Although TorDNSEL is better, or so the Tor developers told me, I have not actually read [http://tor.eff.org/svn/trunk/doc/contrib/torel-design.txt the documentation] yet.
Anyway, the Python exitlist script offers the following advantages:
* Since it uses the same data the Tor client uses, it is as up-to-date as the client data. (Note that this means you have to actually run Tor in order to get the data.)
* Accurately parses exit policies so only exit nodes that actually exit to Wikipaedia are listed.
* Minimal false negatives and false positives. Yes, it can get slightly out of date, it can only resist as many bad authorities as the Tor client itself, and it can't get all of the multi-homed exit node IPs, but it is better than the other lists you use. Regarding the multi-homed exit nodes, [http://peertech.org/pub/tor-dissim-p80-exits.txt this list] gives the real exiting IPs of such exit nodes which exit on port 80, rather than the advertised IPs.
* It generates a nice, happy, easily parsable list of IPs with one IP address per line. This would be great if the community were willing to give adminship to a bot such as [[Wikipedia:Requests for adminship/TawkerbotTorA|TawkerbotTorA]].
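To show how little work the one-IP-per-line output leaves for a bot, here is a minimal sketch of reading it. It is written for a current Python rather than the Python 2 the exitlist script itself targets, the function name is my own, and it assumes the script was run without the verbose option so that each line holds a bare IPv4 address.

```python
import ipaddress


def parse_exit_list(text):
    """Parse exitlist-style output (one IPv4 address per line).

    Blank lines and anything that is not a plain IPv4 address are skipped.
    Returns a sorted list of unique ipaddress.IPv4Address objects.
    """
    addresses = set()
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        try:
            addresses.add(ipaddress.IPv4Address(line))
        except ValueError:
            # Not a bare IPv4 address (for example, verbose output); ignore it.
            continue
    return sorted(addresses)
```

A blocking bot would then only need to diff this list against the currently blocked Tor IPs and block or unblock accordingly.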
For your convenience, I have included the exitlist script with the minor modifications that it looks for nodes exiting to Wikipaedia and includes the script's license.
On *nix, or at least [[OpenBSD]] (probably other *nixes too, but I tested on OpenBSD), it runs with the following command:
<pre>
cat ~/.tor/cached-routers* | python exitlist > torexitwikipaedia
</pre>
I don't know what works on Windows.
=== Modified exitlist code ===
<pre>
#!/usr/bin/python
# Copyright 2005-2006 Nick Mathewson
# See the LICENSE file in the Tor distribution for licensing information.
# The following licensing information copied by Armed Blowfish
# You can find the updated exitlist code here:
# http://tor.eff.org/svn/trunk/contrib/exitlist
# You can find the Tor license here:
# http://tor.eff.org/svn/trunk/LICENSE
# Relevant portion of Tor license file:
# Tor is distributed under this license:

# Copyright (c) 2001-2004, Roger Dingledine
# Copyright (c) 2004-2007, Roger Dingledine, Nick Mathewson

# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are
# met:

# * Redistributions of source code must retain the above copyright
# notice, this list of conditions and the following disclaimer.

# * Redistributions in binary form must reproduce the above
# copyright notice, this list of conditions and the following disclaimer
# in the documentation and/or other materials provided with the
# distribution.

# * Neither the names of the copyright owners nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.

# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
# End info copied by Armed Blowfish. When updating to the latest
# version of the exit list, don't forget to include licensing information
# and make output Wikipaedia-specific.
# Requires Python 2.2 or later.

"""
 exitlist -- Given a Tor directory on stdin, lists the Tor servers
 that accept connections to given addresses.
example usage (Tor 0.1.0.x and earlier):
python exitlist 18.244.0.188:80 < ~/.tor/cached-directory
example usage (Tor 0.1.1.10-alpha and later):
cat ~/.tor/cached-routers* | python exitlist 18.244.0.188:80
If you're using Tor 0.1.1.18-rc or later, you should look at the "FetchUselessDescriptors" config option in the man page.
Note that this script won't give you a perfect list of IP addresses that might connect to you using Tor, since some Tor servers might exit from other addresses than the one they publish.
"""
#
# Change this to True if you want more verbose output. By default, we
# only print the IPs of the servers that accept any of the listed
# addresses, one per line.
#
VERBOSE = False

#
# Change this to True if you want to reverse the output, and list the
# servers that accept *none* of the listed addresses.
#
INVERSE = False

#
# Change this list to contain all of the target services you are interested
# in. It must contain one entry per line, each consisting of an IPv4 address,
# a colon, and a port number. This default is only used if we don't learn
# about any addresses from the command-line.
# Made this part Wikipaedia-specific. -- Armed Blowfish
# 66.230.200.100:80 is en.wikipedia.org
# 66.230.200.219:443 is secure.wikimedia.org
ADDRESSES_OF_INTEREST = """
    66.230.200.100:80
    66.230.200.219:443
"""

#
# YOU DO NOT NEED TO EDIT AFTER THIS POINT.
#
import sys
import re
import getopt
import socket
import struct
import time

assert sys.version_info >= (2,2)

def maskIP(ip,mask):
    return "".join([chr(ord(a) & ord(b)) for a,b in zip(ip,mask)])

def maskFromLong(lng):
    return struct.pack("!L", lng)

def maskByBits(n):
    return maskFromLong(0xffffffffl ^ ((1L<<(32-n))-1))
class Pattern:
    """
    >>> import socket
    >>> ip1 = socket.inet_aton("192.169.64.11")
    >>> ip2 = socket.inet_aton("192.168.64.11")
    >>> ip3 = socket.inet_aton("18.244.0.188")

    >>> print Pattern.parse("18.244.0.188")
    18.244.0.188/255.255.255.255:1-65535
    >>> print Pattern.parse("18.244.0.188/16:*")
    18.244.0.0/255.255.0.0:1-65535
    >>> print Pattern.parse("18.244.0.188/2.2.2.2:80")
    2.0.0.0/2.2.2.2:80-80
    >>> print Pattern.parse("192.168.0.1/255.255.00.0:22-25")
    192.168.0.0/255.255.0.0:22-25
    >>> p1 = Pattern.parse("192.168.0.1/255.255.00.0:22-25")
    >>> import socket
    >>> p1.appliesTo(ip1, 22)
    False
    >>> p1.appliesTo(ip2, 22)
    True
    >>> p1.appliesTo(ip2, 25)
    True
    >>> p1.appliesTo(ip2, 26)
    False
    """
    def __init__(self, ip, mask, portMin, portMax):
        self.ip = maskIP(ip,mask)
        self.mask = mask
        self.portMin = portMin
        self.portMax = portMax

    def __str__(self):
        return "%s/%s:%s-%s"%(socket.inet_ntoa(self.ip),
                              socket.inet_ntoa(self.mask),
                              self.portMin,
                              self.portMax)

    def parse(s):
        if ":" in s:
            addrspec, portspec = s.split(":",1)
        else:
            addrspec, portspec = s, "*"

        if addrspec == '*':
            ip,mask = "\x00\x00\x00\x00","\x00\x00\x00\x00"
        elif '/' not in addrspec:
            ip = socket.inet_aton(addrspec)
            mask = "\xff\xff\xff\xff"
        else:
            ip,mask = addrspec.split("/",1)
            ip = socket.inet_aton(ip)
            if "." in mask:
                mask = socket.inet_aton(mask)
            else:
                mask = maskByBits(int(mask))

        if portspec == '*':
            portMin = 1
            portMax = 65535
        elif '-' not in portspec:
            portMin = portMax = int(portspec)
        else:
            portMin, portMax = map(int,portspec.split("-",1))

        return Pattern(ip,mask,portMin,portMax)

    parse = staticmethod(parse)

    def appliesTo(self, ip, port):
        return ((maskIP(ip,self.mask) == self.ip) and
                (self.portMin <= port <= self.portMax))
class Policy: """ >>> import socket >>> ip1 = socket.inet_aton("192.169.64.11") >>> ip2 = socket.inet_aton("192.168.64.11") >>> ip3 = socket.inet_aton("18.244.0.188")
>>> pol = Policy.parseLines(["reject *:80","accept 18.244.0.188:*"]) >>> print str(pol).strip() reject 0.0.0.0/0.0.0.0:80-80 accept 18.244.0.188/255.255.255.255:1-65535 >>> pol.accepts(ip1,80) False >>> pol.accepts(ip3,80) False >>> pol.accepts(ip3,81) True """
def __init__(self, lst): self.lst = lst
    def parseLines(lines):
        r = []
        for item in lines:
            a,p=item.split(" ",1)
            if a == 'accept':
                a = True
            elif a == 'reject':
                a = False
            else:
                # Format the message; the original passed a as a second argument.
                raise ValueError("Unrecognized action %r" % a)
            p = Pattern.parse(p)
            r.append((p,a))
        return Policy(r)

    parseLines = staticmethod(parseLines)

    def __str__(self):
        r = []
        for pat, accept in self.lst:
            rule = accept and "accept" or "reject"
            r.append("%s %s\n"%(rule,pat))
        return "".join(r)

    def accepts(self, ip, port):
        # First matching rule wins; if nothing matches, accept.
        for pattern,accept in self.lst:
            if pattern.appliesTo(ip,port):
                return accept
        return True

class Server:
    def __init__(self, name, ip, policy, published, fingerprint):
        self.name = name
        self.ip = ip
        self.policy = policy
        self.published = published
        self.fingerprint = fingerprint

def uniq_sort(lst):
    d = {}
    for item in lst:
        d[item] = 1
    lst = d.keys()
    lst.sort()
    return lst

def run():
    global VERBOSE
    global INVERSE
    global ADDRESSES_OF_INTEREST

    if len(sys.argv) > 1:
        try:
            opts, pargs = getopt.getopt(sys.argv[1:], "vx")
        except getopt.GetoptError, e:
            print """
usage: cat ~/.tor/cached-routers* | %s [-v] [-x] [host:port [host:port [...]]]
    -v  verbose output
    -x  invert results
""" % sys.argv[0]
            sys.exit(0)

        for o, a in opts:
            if o == "-v":
                VERBOSE = True
            if o == "-x":
                INVERSE = True
        if len(pargs):
            ADDRESSES_OF_INTEREST = "\n".join(pargs)

    servers = []
    policy = []
    name = ip = None
    published = 0
    fp = ""
    for line in sys.stdin.xreadlines():
        if line.startswith('router '):
            if name:
                servers.append(Server(name, ip, Policy.parseLines(policy),
                                      published, fp))
            _, name, ip, rest = line.split(" ", 3)
            policy = []
            published = 0
            fp = ""
        elif line.startswith('fingerprint') or \
                 line.startswith('opt fingerprint'):
            elts = line.strip().split()
            if elts[0] == 'opt':
                del elts[0]
            assert elts[0] == 'fingerprint'
            del elts[0]
            fp = "".join(elts)
        elif line.startswith('accept ') or line.startswith('reject '):
            policy.append(line.strip())
        elif line.startswith('published '):
            date = time.strptime(line[len('published '):].strip(),
                                 "%Y-%m-%d %H:%M:%S")
            published = time.mktime(date)

    if name:
        servers.append(Server(name, ip, Policy.parseLines(policy), published, fp))

    targets = []
    for line in ADDRESSES_OF_INTEREST.split("\n"):
        line = line.strip()
        if not line: continue
        p = Pattern.parse(line)
        targets.append((p.ip, p.portMin))

    # Keep only the most recently published descriptor for each fingerprint.
    latest = {}
    for s in servers:
        # Compare publication times; the original compared a time to a Server.
        if (not latest.has_key(s.fingerprint)
            or s.published > latest[s.fingerprint].published):
            latest[s.fingerprint] = s
    servers = latest.values()

    accepters, rejecters = {}, {}
    for s in servers:
        for ip,port in targets:
            if s.policy.accepts(ip,port):
                accepters[s.ip] = s
                break
        else:
            rejecters[s.ip] = s

    # If any server at IP foo accepts, the IP does not reject.
    for k in accepters.keys():
        if rejecters.has_key(k):
            del rejecters[k]

    if INVERSE:
        printlist = rejecters.values()
    else:
        printlist = accepters.values()

    ents = []
    if VERBOSE:
        ents = uniq_sort([ "%s\t%s"%(s.ip,s.name) for s in printlist ])
    else:
        ents = uniq_sort([ s.ip for s in printlist ])
    for e in ents:
        print e

def _test():
    import doctest, exitparse
    return doctest.testmod(exitparse)
#_test()

run()
</pre>
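
The script above is Python 2. For anyone reading along on Python 3, here is a minimal sketch of the same first-match policy check, assuming the standard <code>ipaddress</code> module is an acceptable stand-in for the hand-rolled <code>maskIP</code>/<code>maskByBits</code> helpers. The names <code>parse_rule</code> and <code>policy_accepts</code> are made up for this illustration; they are not part of the script above or of Tor. One difference to be aware of: as far as I know, <code>ipaddress</code> refuses non-contiguous masks such as <code>2.2.2.2</code>, which the original accepts.

```python
import ipaddress


def parse_rule(line):
    """Parse 'accept|reject ADDR[/MASK][:PORTS]' into a rule tuple.

    Hypothetical helper for illustration only.
    Returns (accept, network, port_min, port_max).
    """
    action, pattern = line.split(" ", 1)
    if action not in ("accept", "reject"):
        raise ValueError("Unrecognized action %r" % action)

    addrspec, _, portspec = pattern.strip().partition(":")
    if addrspec == "*":
        addrspec = "0.0.0.0/0"
    # strict=False masks off the host bits, as maskIP() does above.
    network = ipaddress.ip_network(addrspec, strict=False)

    if portspec in ("", "*"):
        port_min, port_max = 1, 65535
    elif "-" in portspec:
        port_min, port_max = map(int, portspec.split("-", 1))
    else:
        port_min = port_max = int(portspec)

    return action == "accept", network, port_min, port_max


def policy_accepts(rules, ip, port):
    """First matching rule wins; accept if no rule matches."""
    addr = ipaddress.ip_address(ip)
    for accept, network, port_min, port_max in rules:
        if addr in network and port_min <= port <= port_max:
            return accept
    return True


rules = [parse_rule(r) for r in ["reject *:80", "accept 18.244.0.188:*"]]
print(policy_accepts(rules, "18.244.0.188", 80))  # False: the reject rule matches first
print(policy_accepts(rules, "18.244.0.188", 81))  # True
```

The rules and addresses in the usage lines are the same ones the <code>Policy</code> doctest uses, so the two versions can be checked against each other.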
=== An old patch ===
Adam Langley wrote a patch that would make it easier to play temporary blocking games with Tor, just as has been done with AOL in the past. I don't know the details; apparently it needed a few fixes from someone familiar with the MediaWiki source code, but in any case it never got committed.<ref name="torblockpatch">http://www.imperialviolet.org/binary/mediawiki-1.4.4-tor-block.patch</ref><ref name="torblockpatch-ortalk">http://archives.seul.org/or/talk/May-2005/msg00128.html</ref><ref name="torblockpatch-ortalk-alreadycoded">http://archives.seul.org/or/talk/Sep-2005/msg00312.html</ref>
== Works cited ==
<!-- Yes, I know very well that many of these do not count as [[WP:RS|reliable sources]]. That's okay, I'm not writing an article. If, for example, you question whether Jimbo is actually the author of some of these mailing list messages, ask him. -->
<references />