On 17 May 2010 20:05, Platonides <Platonides@gmail.com> wrote:
Tim Starling wrote:
Audio CAPTCHAs, like visual CAPTCHAs, are not accessible for all people and do not conform to W3C accessibility guidelines. What's more, they're easier to crack than visual CAPTCHAs due to their one-dimensional nature. This is especially true if you use a public source dictionary of spoken phrases, against which an FFT correlation can be run.
Just as with image captchas, you'd need to introduce noise into it.
If you are working from known constituents, you can use cross-correlation to ignore noise pretty effectively (I believe it's what humans do). The choice then is either to make the noise sound like the captcha's numbers (Google's approach), which is very hard to solve (at least I find it so), or to use reCAPTCHA's vast database of unknown sound files (with noise added to obscure the phonemes). The human brain is capable of filling in completely obscured phonemes in order to make the sentence "make sense" (assuming the listener speaks the language in question, which is another usability problem with these), something that computers are not yet so good at.
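To make the cross-correlation point concrete, here is a rough sketch in Python. It is not taken from any real captcha or cracker: the "clips" are hypothetical noise-like waveforms standing in for recordings of known spoken digits, and the names `cross_correlate` and `identify` are made up for illustration. One known clip is buried in noise of about the same power, and sliding each known clip across the recording still recovers which clip it was and where it starts.

```python
import math
import random

def cross_correlate(signal, template):
    """Slide template over signal; return (best normalized score, offset)."""
    norm = math.sqrt(sum(t * t for t in template))
    best_score, best_offset = float("-inf"), 0
    for offset in range(len(signal) - len(template) + 1):
        score = sum(signal[offset + i] * t for i, t in enumerate(template)) / norm
        if score > best_score:
            best_score, best_offset = score, offset
    return best_score, best_offset

def identify(signal, templates):
    """Return (label, offset) of the known clip that correlates best."""
    results = {name: cross_correlate(signal, tpl) for name, tpl in templates.items()}
    label = max(results, key=lambda name: results[name][0])
    return label, results[label][1]

rng = random.Random(0)

# Hypothetical stand-ins for a public dictionary of known digit recordings.
templates = {
    name: [rng.gauss(0, 1) for _ in range(200)]
    for name in ("one", "two", "three")
}

# A "captcha" recording: background noise with the clip for "two" mixed in.
true_offset = 400
signal = [rng.gauss(0, 1) for _ in range(1000)]
for i, t in enumerate(templates["two"]):
    signal[true_offset + i] += t

label, offset = identify(signal, templates)
print(label, offset)
```

This is the direct O(N*M) form; the FFT correlation mentioned earlier in the thread computes the same scores faster. Real speech clips are far less distinct from one another than these synthetic ones, so treat this only as an illustration of why noise alone does not help when the constituents are known.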
It's likely to be much easier to improve the "request an account from a human" process, which has built-in rate-limiting, a little bit of Turing test, and a nice splash of the common sense that is so hard to instill in an automated system. (Alternatively we could just implement an insecure audio captcha, "safe" in the knowledge that no-one has enough motivation to crack it, though I imagine the implementation would still take significant effort.)
I have been trying flite, and didn't find the synthesized text too understandable by itself. :(
In which case a computer could probably solve them better than you :).
Conrad