Re: [Wikitech-l] Captcha readibility

8 Oct 2007


      On 10/7/07, Simetrical Simetrical+wikilist@gmail.com wrote:
...
On 10/7/07, Platonides Platonides@gmail.com wrote:
...
It has been discussed here before that some captchas are too hard
to pass, but without samples.
Today I found one of these captchas. I read ghooktrust but MediaWiki
didn't agree. The first letter could be a 5, but we don't use numbers.
So I finally noticed it might be an s.
Well, the captcha always consists of two words concatenated together,
I do believe.  "Shook" is a rather obscure word, however.  Perhaps the
dictionary could be made less comprehensive.  Although that brings us
back to non-English speakers, who won't be helped at all.
It could be either, yes, looking at it.  But if you refresh it gives
you a different captcha, right?
We should change to random characters: using dictionary words, even a
'secret' dictionary, substantially reduces the entropy of the
captchas.  Yes, the dictionary makes the captcha easier for humans, but
it's an even bigger help to computers, which can fit much more accurate
state transition models in their memory.
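To put rough numbers on the entropy point, here is a minimal sketch. The
dictionary size (4096 words) and the 26-letter alphabet are assumptions
picked for illustration, not figures from the real captcha setup; the
length of 10 matches the "shooktrust" sample above.

```python
import math


def dictionary_entropy_bits(dictionary_size: int, words_per_captcha: int = 2) -> float:
    """Entropy in bits when each word is drawn uniformly from the dictionary."""
    return words_per_captcha * math.log2(dictionary_size)


def random_entropy_bits(alphabet_size: int, length: int) -> float:
    """Entropy in bits when each character is drawn uniformly and independently."""
    return length * math.log2(alphabet_size)


# Hypothetical numbers: a 4096-word dictionary, lowercase letters only.
two_words = dictionary_entropy_bits(4096)      # 2 * 12 = 24 bits
ten_letters = random_entropy_bits(26, 10)      # about 47 bits

print(f"two dictionary words: {two_words:.1f} bits")
print(f"ten random letters:   {ten_letters:.1f} bits")
```

Under these assumptions a ten-letter captcha made of two dictionary words
carries about half the entropy of ten random letters, and an attacker who
knows or guesses the dictionary only has to choose among the words.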
The goal of the captcha should be to maximize the gap between humans
and computers, not to be maximally hard.
Right now our captcha is weak by standard wisdom: the characters are
too easily segmented.  A tuned copy of the Tesseract 2.0 OCR without
any statistical modeling can recognize about 25% of the letters in most
of the Wikimedia captchas. That's still pretty far from cracking it,
but I bet someone skilled at captcha cracking wouldn't have too hard a
time.
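For a sense of how far 25% per letter is from cracking a whole captcha,
a back-of-the-envelope sketch follows. It assumes a 10-letter captcha and
that letter errors are independent; both are my assumptions, and a
dictionary-aware attacker would not be bound by the second one.

```python
def whole_captcha_success(per_letter_accuracy: float, length: int) -> float:
    """Chance of reading every letter correctly, assuming independent errors."""
    return per_letter_accuracy ** length


# Assumed: 25% per-letter accuracy (the figure above), 10 letters per captcha.
p = whole_captcha_success(0.25, 10)
expected_attempts = 1 / p

print(f"success per attempt: {p:.2e}")
print(f"expected attempts:   {expected_attempts:.0f}")
```

The per-attempt success rate grows quickly with per-letter accuracy,
which is why correcting OCR output against a word list is such a help to
the attacker.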
The captcha generator is a really simple Python script that is easy
and fun to modify.  I made a copy here that distorts the text less but
packs the characters closer together and adds a wiggly connecting line,
which is popular these days.  The result is easier to read, making the
use of mostly random characters acceptable, and it completely defeats
Tesseract ... but I can't prove that it's not massively less secure
against some other attack, so I haven't proposed that we use it. :(
