Minh Nguyen wrote:
Speaking of anglocentrism, I wonder if it'd be possible for the current
captcha software to generate captchas in other languages. I've seen some
generated captchas at the Vietnamese Wikipedia that would definitely
confuse Vietnamese-speakers (can't remember the words exactly), because
of things like r's and n's smooshed up right next to each other and
stuff. The user might have to /guess/ because the English words really
don't follow Vietnamese spelling rules.
An advantage to localizing the captchas would be that it might reduce
the impact of spambots at non-English projects. As far as I know, there
isn't yet a captcha-defeating bot that understands Vietnamese or Basque
or Quechua.
By the way, I'm only proposing localizing for most languages that use
the Latin alphabet, because requiring users to respond to a captcha in
Thai or Arabic would exclude a lot of legitimate interwiki users. And
users of other scripts tend to have the means of entering Latin-based
characters. Also, for languages that use diacritical marks, we could
generate the words without the marks and modify
[[MediaWiki:Captcha-createaccount]], asking the user to enter the
word without diacritical marks of any kind.
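
For what it's worth, the mark-stripping idea above could be sketched roughly like this in Python. This is only an illustration, not what the captcha software does: `strip_marks` and the `EXTRA` table are made-up names, and the table covers only the Vietnamese đ/Đ, which has no Unicode decomposition and so has to be mapped by hand.

```python
import unicodedata

# Letters that NFD cannot split into base + mark need an explicit mapping.
# Assumption: only Vietnamese d-with-stroke is handled here; other
# languages may need more entries.
EXTRA = str.maketrans({"đ": "d", "Đ": "D"})


def strip_marks(word: str) -> str:
    """Return `word` with combining diacritical marks removed."""
    # NFD splits precomposed letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFD", word)
    # unicodedata.combining() is non-zero for combining marks.
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return bare.translate(EXTRA)
```

With this, `strip_marks("tiếng Việt")` gives `"tieng Viet"`, so a word list kept with its marks could still be rendered mark-free in the captcha image.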
Indeed it would be possible. What's needed is a word list of about a
thousand short words for each language that needs its own captchas,
since the captcha software uses these to build its challenge strings.
It _might_ be possible to start with a set of common words in English,
and to use Wiktionary to choose the nearest equivalents in each
language.
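
To make the word-list idea concrete, here is a rough Python sketch of how a per-language list might be filtered and turned into a challenge string. Everything in it is an assumption for illustration: the function names, the length limits, and the choice of joining two words are mine, not taken from the actual captcha code.

```python
import random


def usable_words(words, min_len=3, max_len=6):
    """Keep short, unique, plain-Latin words from a raw word list."""
    seen = set()
    kept = []
    for raw in words:
        word = raw.strip().lower()
        # Assumed policy: ASCII letters only, so nothing needs diacritics.
        if not (word.isascii() and word.isalpha()):
            continue
        if min_len <= len(word) <= max_len and word not in seen:
            seen.add(word)
            kept.append(word)
    return kept


def make_challenge(words, rng, count=2):
    """Join `count` distinct randomly chosen words into one string."""
    return "".join(rng.sample(words, count))


if __name__ == "__main__":
    # Tiny stand-in list; a real one would have about a thousand entries.
    pool = usable_words(["cay", "mua", "nha", "sông", "Cay", "a"])
    print(make_challenge(pool, random.Random()))
```

Words that still carry diacritics ("sông" above) are simply dropped here; they could instead be passed through a mark-stripping step first if the list would otherwise end up too short.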
However, I also think that it would be a good idea to have captchas in
non-Latin scripts as well: presumably many Arabic or Thai readers have
the same problems recognizing Latin characters that readers of Latin
scripts would have with Arabic or Thai characters. We could always offer
a Latin-script alternative as a fallback.
-- Neil