On 17 May 2010 20:05, Platonides <Platonides(a)gmail.com> wrote:
Tim Starling wrote:
Audio CAPTCHAs, like visual CAPTCHAs, are not accessible for all
people and do not conform to W3C accessibility guidelines. What's
more, they're easier to crack than visual CAPTCHAs due to their
one-dimensional nature. This is especially true if you use a public
source dictionary of spoken phrases, against which an FFT correlation
can be run.
Just as with image captchas, you'd need to introduce noise into it.
If you are working from known constituents, you can use
cross-correlation to ignore noise pretty effectively (I believe it's
what humans do). The choice then is either to make the noise sound
like the captcha's numbers (Google's approach), which is very hard to
solve (at least I find it so), or to use reCAPTCHA's vast database of
unknown sound files (with noise added to obscure the phonemes). The
human brain is capable of filling in completely obscured phonemes in
order to make the sentence "make sense" (assuming they speak the
language in question - another usability problem with these),
something that computers are not yet so good at.
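
To make the cross-correlation point concrete, here is a minimal sketch in
Python. Everything in it is an assumption for illustration: the "words" are
random +/-1 sequences standing in for recorded digits from a known
dictionary, the noise is plain Gaussian, and the correlation is done directly
in the time domain rather than via an FFT (the result is the same, the FFT
is only faster). It is not how any real CAPTCHA solver is known to work.

```python
import random

def cross_correlate(signal, template):
    """Sliding dot product of template against signal (full-overlap offsets only)."""
    n, m = len(signal), len(template)
    return [
        sum(signal[i + j] * template[j] for j in range(m))
        for i in range(n - m + 1)
    ]

def best_match(signal, templates):
    """Return (label, offset, score) for the highest correlation peak over all templates."""
    best = None
    for label, template in templates.items():
        scores = cross_correlate(signal, template)
        offset = max(range(len(scores)), key=scores.__getitem__)
        if best is None or scores[offset] > best[2]:
            best = (label, offset, scores[offset])
    return best

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical dictionary of known constituents: random +/-1 sequences
# standing in for recordings of spoken digits.
templates = {
    word: [rng.choice((-1.0, 1.0)) for _ in range(200)]
    for word in ("three", "seven", "nine")
}

# Background noise with the same power as the word itself (about 0 dB SNR),
# with one known word buried in it at a chosen offset.
true_offset = 150
signal = [rng.gauss(0.0, 1.0) for _ in range(600)]
for j, sample in enumerate(templates["seven"]):
    signal[true_offset + j] += sample

label, offset, score = best_match(signal, templates)
print(label, offset)
```

The noise is uncorrelated with the templates, so its contribution to each
dot product mostly averages out, while the matching template adds up
coherently at the right offset. That is why noise alone does little against
an attacker who already has the dictionary.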
It's likely to be much easier to improve the "request an account from
a human" process - which has inbuilt rate-limiting, a little bit of
Turing test, and a nice splash of common sense that is so hard to
instill in an automated system. (Alternatively we could just implement
an insecure audio captcha, "safe" in the knowledge that no-one has
enough motivation to crack it - I imagine the implementation would
still take significant effort.)
I have been trying flite, and didn't find the synthesized text too
understandable by itself. :(
In which case a computer could probably solve them better than you :).
Conrad