On Monday 27 June 2005 11:14, Andrew Gray wrote:
On 25/06/05, Jake Waskett jake@waskett.org wrote:
Of those, only 20 have proxy or cache in the name.
Thoughts on how useful this sort of data would be, given the reasonably sized sample above?
Ok, so of 126 addresses, we have about 20 proxies. So about 16% of anonymous Wikipedia users are recognised as being behind a proxy, using this scheme. I don't know the answer to this question, but does anybody know roughly what proportion of web users go through a proxy server? Is it close to 16%? If so, we've got a pretty good scheme here.
I've spoken to a friend working at one of the larger ISPs; the answer is that it varies quite a bit. Only a minority of ISPs use them, but those tend to be large ISPs (the canonical example is all the AOL proxies you see around).
The upside is that most people are pragmatic, and call their proxy servers things like "proxy-43765". So it looks like this is a fairly effective way of identifying *most* proxies.
[He notes that there's also a "forwarded" header through most ISP caches, which contains the "original" originating IP; I don't know if this is accessible in this context or not, but it's useful to know it exists]
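As a rough illustration of the "forwarded" header mentioned above: the header in question is most likely X-Forwarded-For, though that name is an assumption, since the thread only says "forwarded". The Python sketch below assumes the usual comma-separated form with the original client listed first. The function name is made up for illustration, and because the header is set by the client side and can be forged, its value could only ever be a hint.

```python
import ipaddress


def original_client_ip(header_value):
    """Return the leftmost address of an X-Forwarded-For style value, or None.

    Assumption: the value looks like "client, proxy1, proxy2", so only the
    first entry can be the original client.
    """
    first = header_value.split(",")[0].strip()
    try:
        # ip_address() raises ValueError for anything that is not an IP,
        # e.g. the token "unknown" that some proxies insert.
        return str(ipaddress.ip_address(first))
    except ValueError:
        return None
```

If the first entry is not a valid address, the sketch returns None rather than falling back to a later entry, because later entries would be proxies, not the client.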
[I also did another test on a larger sample - this brought it down to ~10% having "proxy" or "cache" in them. I may do further as resources and tuits permit.]
Seems a shame that Wikipedia (rightly) doesn't allow original research. This is very interesting reading. :-)
I seem to remember that Wikipedia had its millionth edit (or something like that) not long ago. 10-20% might not seem much, but it helps put it in perspective.
Of course, a determined user could create a sub-domain with 'proxy' or 'cache' in the title, which would fool a simple software implementation, but perhaps not a human.
In reply to geni's comment, we're talking about a minor change to the software anyway, so all that's needed is to present the admin with this information at the time that he or she chooses to block a user.
Ideally, the software could give the admin a "no IP block" option, to exercise at his or her discretion (the software may already do this; I don't know).
I'm not an admin, so can't really comment on how the process actually works. Can I just check I have the mechanism right here? User:XYZ goes and vandalises an article; an admin bans them; the system then automatically slaps a short ban on the associated IP address, to prevent them logging out and trying again?
I'm not an admin either, so at the risk of this becoming the "uninformed users speculate about admins" thread, let me offer my 2c.
My *understanding* is that the IP blocks (aka autoblocks) are added by the system at a later time, for exactly the reason you suggest. However, their implementation is very odd indeed. Instead of expiring when the original block did, they add the duration of the block to the time that the IP concerned last accessed Wikipedia (as opposed to the last attempt to edit). As I once discovered when legitimately blocked for a 3RR violation, this has the consequence that merely refreshing the list of currently blocked users to check whether the block has expired will keep you blocked indefinitely.
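To make the consequence concrete, here is a small Python sketch of the two expiry rules. It models only my understanding as described above, not the actual MediaWiki code; the function name, the 24-hour duration and the 6-hour reload interval are all made up for illustration.

```python
from datetime import datetime, timedelta


def count_blocked_visits(visits, block_start, duration, reset_on_access):
    """Count how many visits (sorted by time) find the IP still blocked.

    reset_on_access=True models the behaviour described above: every access
    pushes the expiry out to (access time + duration).
    reset_on_access=False models an autoblock that simply expires when the
    original block does.
    """
    expiry = block_start + duration
    blocked = 0
    for now in visits:
        if now >= expiry:
            break  # the block has lapsed, so later visits are unblocked too
        blocked += 1
        if reset_on_access:
            expiry = now + duration
    return blocked


block_start = datetime(2005, 6, 27, 12, 0)
duration = timedelta(hours=24)
# The blocked user reloads the block list every 6 hours for three days.
visits = [block_start + timedelta(hours=6 * i) for i in range(1, 13)]

always_blocked = count_blocked_visits(visits, block_start, duration, True)
expires_normally = count_blocked_visits(visits, block_start, duration, False)
```

With the reset rule all twelve visits are blocked, because each reload arrives before the expiry that the previous reload set. With the fixed expiry only the first three visits (at 6, 12 and 18 hours) are.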
This shouldn't be a problem, however. The system must be storing the last IP used by a user, since this autoblock-on-access mechanism cannot operate without that data, so it can easily be checked at the time of an admin setting a block.
It looks like in 80%+ of cases, telling people what the IP resolves to won't make any difference; it'll just be extra noise (with some occasional amusement, as when you notice a .gov domain). How does this sound -
Logical.
a) Admin goes to block a user. System does a check on IP address, resolves it to 473a.residence.some.edu, doesn't flag it as a proxy, keeps quiet, IP blocked.
b) Admin goes to block a user. System does a check on IP address, resolves it to usercache.admin.some.edu, and flags it because it contains *cache*. Puts up a signal to the admin - "The associated IP address identifies as USERCACHE.admin.some.edu, and blocking it may affect multiple people. Do you wish to block it anyway?". Admin makes the call.
Again, logical. We'd need to have a list of words to scan for, but this is easy enough and the load on the server minimal.
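Cases (a) and (b) above could be sketched roughly like this in Python. Only the words "proxy" and "cache" come from this thread; the function names are hypothetical, and the reverse lookup uses the standard library's socket.gethostbyaddr purely as a stand-in for whatever the real software would do.

```python
import socket

PROXY_KEYWORDS = ("proxy", "cache")  # the only two words taken from the thread


def matching_keyword(hostname, keywords=PROXY_KEYWORDS):
    """Return the first keyword contained in the hostname, or None."""
    lowered = hostname.lower()
    for word in keywords:
        if word in lowered:
            return word
    return None


def reverse_lookup(ip):
    """Best-effort reverse DNS; returns None if the address does not resolve."""
    try:
        return socket.gethostbyaddr(ip)[0]
    except OSError:  # socket.herror and socket.gaierror are both OSErrors
        return None


def block_warning(hostname):
    """Return a warning for the admin, or None when there is nothing to flag."""
    if hostname is None or matching_keyword(hostname) is None:
        return None  # case (a): keep quiet and block as usual
    # case (b): let the admin make the call
    return (
        f"The associated IP address identifies as {hostname}, and blocking it "
        "may affect multiple people. Do you wish to block it anyway?"
    )
```

A plain substring test is about as cheap as it gets, which fits the point that the load on the server would be minimal. It is also exactly what a determined user could fool with a hostname of their own choosing.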
In this case, I think it would be useful for an admin to have the facility to set a user block but prevent autoblocks from being applied. This just means setting a flag in the block table. As I explained before, there are other ways of achieving proxy-friendly autoblock-equivalents, but that might be too complicated.
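A minimal sketch of what "setting a flag in the block table" might look like, in Python for illustration only. I don't know the real table layout, so the row type, its fields and the helper function are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BlockRow:
    """One hypothetical row of the block table."""
    username: str
    duration_hours: int
    autoblock: bool = True  # hypothetical flag; True keeps current behaviour


def ip_to_autoblock(block: BlockRow, last_ip: str) -> Optional[str]:
    """Return the IP that should be autoblocked, or None if the admin opted out."""
    return last_ip if block.autoblock else None
```

Defaulting the flag to True means existing blocks would behave as they do now, and an admin would only clear it when the address looks shared.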
This would leave us with the functionality we have now, but give an option for a simple override when it's likely the IP address isn't "personal". The fact that the display only comes up when it contains one of the keywords means that the privacy implications are low - and if you want it trimmed further, you can have it say that "...identifies as USERCACHE.admin.*.*" or the like. It also limits the amount of time wasted by admins, since it seems to be the case that without one of the keywords, in most cases, a cache/proxy server won't be apparent from the address alone.
Thoughts?
Seems entirely logical to me. It would be nice to hear from somebody who *is* an admin, and can comment on that basis. How would such a facility affect you people?
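For the trimmed display suggested above ("...identifies as USERCACHE.admin.*.*"), a rough Python sketch follows. The function name and the choice of keeping two labels are assumptions made to match that one example.

```python
def trimmed_hostname(hostname, keyword, keep_labels=2):
    """Mask all but the first keep_labels labels of a hostname with '*'.

    Any kept label that contains the keyword is upper-cased so that it
    stands out, as in the example from the thread.
    """
    shown = []
    for index, label in enumerate(hostname.split(".")):
        if index >= keep_labels:
            shown.append("*")
        elif keyword in label.lower():
            shown.append(label.upper())
        else:
            shown.append(label)
    return ".".join(shown)
```

Called as trimmed_hostname("usercache.admin.some.edu", "cache"), this gives "USERCACHE.admin.*.*", which shows the admin why the address was flagged without revealing the institution.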