Tim Starling wrote:
Audio CAPTCHAs, like visual CAPTCHAs, are not accessible for all people and do not conform to W3C accessibility guidelines. What's more, they're easier to crack than visual CAPTCHAs due to their one-dimensional nature. This is especially true if you use a public source dictionary of spoken phrases, against which an FFT correlation can be run.
Just as with image CAPTCHAs, you'd need to introduce noise into the audio.
I have been trying flite, and didn't find the synthesized speech very understandable even by itself. :(
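
For what it's worth, here is a rough sketch of the kind of FFT correlation attack mentioned in the quote, and of what additive noise does to it. Everything in it is an assumption made for illustration: the "dictionary" is three synthetic two-tone signals standing in for recorded words, and the names tone, peak_correlation, add_noise and guess are made up for this example. Real speech and a real attack would be considerably messier.

```python
import cmath
import math
import random

def fft(x, inverse=False):
    """Recursive radix-2 FFT (unscaled); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out

def peak_correlation(a, b):
    """Peak of the normalised circular cross-correlation of two
    equal-length signals, computed as IFFT(FFT(a) * conj(FFT(b)))."""
    n = len(a)
    fa, fb = fft(a), fft(b)
    prod = [x * y.conjugate() for x, y in zip(fa, fb)]
    corr = fft(prod, inverse=True)  # unscaled inverse, so divide by n below
    norm = math.sqrt(sum(v * v for v in a) * sum(v * v for v in b))
    return max(c.real / n for c in corr) / norm

def tone(freqs, n=256):
    """Hypothetical stand-in for a spoken word: a sum of sine tones."""
    return [sum(math.sin(2 * math.pi * f * i / n) for f in freqs)
            for i in range(n)]

# Hypothetical public dictionary of "spoken phrases".
dictionary = {
    "one": tone([5, 12]),
    "two": tone([7, 19]),
    "three": tone([9, 23]),
}

def add_noise(signal, level, rng):
    """Add Gaussian white noise with standard deviation `level`."""
    return [s + rng.gauss(0.0, level) for s in signal]

def guess(clip):
    """Pick the dictionary entry with the highest correlation peak."""
    return max(dictionary, key=lambda w: peak_correlation(clip, dictionary[w]))

if __name__ == "__main__":
    rng = random.Random(0)
    clean = dictionary["two"]
    # Circularly shift the clip: the correlation peak is found at any lag.
    shifted = clean[40:] + clean[:40]
    for level in (0.0, 1.0, 3.0):
        noisy = add_noise(shifted, level, rng)
        score = peak_correlation(noisy, clean)
        print(level, guess(noisy), round(score, 2))
```

In this toy setup the clean clip matches its dictionary entry with a score of 1 regardless of where it starts, and white noise pulls the score down. How much noise, and of what kind, would be needed to stop a match on real speech is not something this sketch answers.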