On 30/07/12 15:28, Pau Giner wrote:
From the UX perspective, a captcha is always an obstacle for the interaction flow.
I agree. But when you're spammed to death if there's no captcha, you end up accepting it as a necessary evil. But don't let this pessimistic view stop you from proposing new alternatives.
Reducing the complexity of user interaction when solving the captcha can benefit all kinds of users but also solve problems for non-English speakers.
Checkbox and honeypot-based captchas avoid most of the problems of text-based captchas since interaction is simplified to the minimum for the user: http://uxmovement.com/forms/captchas-vs-spambots-why-the-checkbox-captcha-wi...
No. Those work against generic spambots. For a small site, pretty much any custom-made captcha will work. When someone designs against your captcha, you need to provide a hard test. If we were comparing against a math captcha, checkbox is more usable while only slightly weaker. Neither of them has a chance against a bot designed against them.
If you run Wikipedia, bad guys will work to defeat your captcha and spam/vandalise/annoy you. If you are developing MediaWiki, a wiki used in thousands of sites [1], spammers will work to make bots capable of spamming all those MediaWiki installs (cf. DantMan's reply). If you are Open Source, then it's much harder to make one (you lose not only the security by obscurity of the code, but also that of the challenges themselves...).
1- http://www.google.com/search?q=%22powered%20by%20mediawiki%22 ~201.000.000 results
Simple questions where the user can select an answer (not type) will solve some of the input-related issues for non-English speakers. These questions can be of different kinds (e.g., "Which one does not belong to the group: Red, Green, Skateboard, Blue?", "Is fire hot or cold?") and they can be based on text or image selection. An example of image-based captcha is available at http://www.picatcha.com/captcha/
No. Those are *harder* since you need knowledge of the English language and its terms.
I can fill in a text captcha on a foreign-language site since its very appearance (after being trained by hundreds of sites!) shows what is expected of me. If I go to http://www.picatcha.com/captcha/, I am asked to "Select ALL the images of «concept»". Which is fine, but requires me to know what that «concept» is. I might e.g. think that hourglasses are a kind of spectacles (eyeglasses) and get very annoyed at not being able to pass it.
Also, making good questions is tricky. You need to produce loads of that kind of question with their answers. If you made just a few hundred (e.g. because they are written by a human), I could make a list of the questions with their answers (manually solved) and spam you as many times as I want.
You want to make intelligent questions that are hard for bots, but anyone should be able to solve them, even if they are young, uneducated or foreign. I may know that I have to rule colors out, but I don't know which of skateboard vs turquoise is the color. And yet, you can't dumb it down so much that a computer will be able to answer it.
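A rough sketch of that answer-list attack, to show how little it takes. The two questions are the ones quoted in this thread; the `SOLVED` table and the function names are made up for illustration, and a real attacker would fill the table from the site's own question pool:

```python
# Hypothetical hand-solved answer list: a human solves each question once,
# and the bot replays the stored answers as often as it likes.
SOLVED = {
    "is fire hot or cold?": "hot",
    "which one does not belong to the group: red, green, skateboard, blue?": "skateboard",
}


def normalize(question: str) -> str:
    """Lowercase and collapse whitespace so trivial variations still match."""
    return " ".join(question.lower().split())


def answer_from_table(question: str):
    """Return the stored answer, or None for a question not solved yet."""
    return SOLVED.get(normalize(question))
```

Any question that comes back as None just goes to the human once more, and the table grows until the pool is exhausted.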
Suppose you are asking questions of the type "Is X Y or Z?" and have made thousands of pairs (that you can't share!). A naive approach would be just to answer Y or Z at random, accepting a 50% failure rate (bots don't mind resending their requests many times, so a captcha that blocks only 50% is broken). But we can do better: when you ask my bot "Is fire hot or cold?" it could go and search Google for those concepts:
* fire hot 1.210.000.000 results
* fire cold 656.000.000 results
There's a very clear correlation of fire with hot rather than with cold, thus it chooses 'hot', and defeats your captcha. :)
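A minimal sketch of such a bot, under stated assumptions: `hit_count` is a hypothetical stand-in that only knows the two result counts quoted above (no real search API is called), and the question pattern covers only the "Is X Y or Z?" form. `chance_of_passing` covers the naive random-guessing case.

```python
import re

# Result counts quoted in this mail. A real bot would query a search engine;
# this table is only a stand-in for that lookup.
HIT_COUNTS = {
    ("fire", "hot"): 1_210_000_000,
    ("fire", "cold"): 656_000_000,
}

QUESTION = re.compile(r"^Is (\w+) (\w+) or (\w+)\?$", re.IGNORECASE)


def hit_count(subject: str, candidate: str) -> int:
    """Hypothetical: number of search results mentioning both words."""
    return HIT_COUNTS.get((subject, candidate), 0)


def solve(question: str) -> str:
    """Pick the option that co-occurs more often with the subject."""
    match = QUESTION.match(question.strip())
    if match is None:
        raise ValueError(f"unsupported question: {question!r}")
    subject, first, second = (word.lower() for word in match.groups())
    return max((first, second), key=lambda option: hit_count(subject, option))


def chance_of_passing(attempts: int, success_rate: float = 0.5) -> float:
    """Chance that at least one of `attempts` independent guesses passes."""
    return 1 - (1 - success_rate) ** attempts
```

With those counts, `solve("Is fire hot or cold?")` picks "hot" whichever order the options come in. Even blind guessing gets through at least once in ten tries with probability 1 - 0.5^10, roughly 99.9%.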
Tagging media can also be used as a captcha. Google has been experimenting with asking users to tag videos as a captcha: http://cups.cs.cmu.edu/soups/2009/proceedings/a14-kleuver.pdf [PDF]
If we were doing this with Wikimedia Commons videos:
a) The video set is known, as are the descriptions. Ergo, match the video with its file and its description.
b) IMHO having to watch a video (even if short) is *more* annoying than typing a text captcha.*
c) No/poor localisation.
* This needs to be balanced with how much you want to enter the captcha-walled garden, of course. I may accept watching your CEO boasting about your service (from which you then ask me the captcha**) in exchange for a Gmail-like mail account or multigigabyte Dropbox storage, but not to watch one every time I sign in!
** Don't complain if he's tagged by most users as 'boring'. :)
In any case, some experimentation would be required to determine whether any of the above approaches (or a combination of several) provides an appropriate security-usability balance for the specific needs of Wikipedia.
We would first need an evaluation of what is considered spam, and how to measure it. If we get lots of bots the day after you enable it, it's clearly broken, but how much time would we need before being x% confident that it is secure enough, when you are just waiting for some random guy to decide to code against your challenge?