On Fri, Oct 5, 2012 at 1:14 PM, Tomasz Finc tfinc@wikimedia.org wrote:
On Fri, Oct 5, 2012 at 12:32 PM, Steven Walling swalling@wikimedia.org wrote:
I don't know how they do that, but would be willing to consider it, contingent on input from Chris Steipp and Ryan Lane.
My knowledge of this is circa 2009-2011, when I designed and implemented it with Arthur, so I'm cc'ing Katie to correct anything that I get wrong. The extension has an internal engine that combines rule sets with a weighted threshold over *when* a user should see the captcha. The system is able to keep track of how many people are seeing captchas, and it was/is actively monitored to correct any issues that cause that number to go up. In a lot of ways, think of it in a similar vein to AbuseFilter.
That's pretty much accurate, although it would require some work to abstract out of the fundraising context. The system that I put in place for the 2010 fundraiser to help prevent fraudulent credit card transactions was a multi-layered and modular system that used a number of heuristic pieces to determine the likelihood that an attempted transaction was fraudulent. We relied primarily on a third-party service offered by MaxMind called 'minfraud' that looked at various pieces of information submitted (like a user's IP address) and returned a 'fraud score', which mapped to the likelihood that the transaction was fraudulent. We played around with a few ideas to build on top of that likelihood (e.g. IP address velocity - how many times the same address attempts to make a transaction in a given timeframe).
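To make the IP address velocity idea concrete, here is a minimal sketch in Python. It is not the code from the fundraising extension; the class name `VelocityTracker`, the `record` method, and the window length are all assumptions made up for illustration.

```python
from collections import defaultdict, deque


class VelocityTracker:
    """Counts attempts per IP address inside a sliding time window.

    Hypothetical helper for illustration only; not part of any real extension.
    """

    def __init__(self, window_seconds):
        self.window_seconds = window_seconds
        # Maps each IP address to the timestamps of its recent attempts.
        self._attempts = defaultdict(deque)

    def record(self, ip, now):
        """Record an attempt at time `now` (seconds) and return the number
        of attempts from `ip` that fall inside the current window."""
        attempts = self._attempts[ip]
        attempts.append(now)
        # Drop timestamps that are a full window or more older than `now`.
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        return len(attempts)


tracker = VelocityTracker(window_seconds=60)
for t in (0, 10, 30):
    count = tracker.record("192.0.2.1", now=t)
print(count)
```

A count like this would presumably be one more input alongside the fraud score rather than a replacement for it.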
The system allowed (maybe still allows?) for different actions to be taken depending on the resulting fraud score. For the 2010 fundraiser, we initially set it up so that the vast majority of transactions would never see any sort of 'challenge' (e.g. a captcha). Then there was a second tier of fraud scores that would trigger a 'challenge' (we were using reCaptcha). The third tier was an outright rejection. Ultimately, we wound up entirely disabling captchas because only a relatively small number of users who seemed to be fraudsters fell into the 'challenge' bucket, and there was a huge outcry over a) our use of reCaptcha, b) use of a captcha system that was not particularly accessible outside of Western/Latin character contexts, and c) use of a captcha system at all.
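The three tiers described above can be sketched roughly as follows. This is an illustration only: the function name `action_for_score` and the threshold values 50 and 90 are assumptions, not the values used in 2010, and nothing here calls the real minfraud service.

```python
def action_for_score(score, challenge_at=50.0, reject_at=90.0):
    """Map a fraud score to one of three actions.

    The thresholds are hypothetical placeholders; a real deployment would
    tune them while watching how many users end up in each tier.
    """
    if challenge_at > reject_at:
        raise ValueError("challenge_at must not exceed reject_at")
    if score >= reject_at:
        return "reject"     # third tier: outright rejection
    if score >= challenge_at:
        return "challenge"  # second tier: show a captcha
    return "allow"          # first tier: no challenge at all


for score in (5.0, 62.5, 97.0):
    print(score, action_for_score(score))
```

Setting `challenge_at` equal to `reject_at` removes the middle tier entirely, which is roughly what disabling captchas amounts to in this sketch.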
I suspect something similar could be implemented here. But I will warn you that captcha is a huge can of worms. At the time, I did a lot of research on captcha effectiveness, and the only captcha solution that seemed remotely effective was reCaptcha (none of the existing captcha solutions packaged in MediaWiki extensions even came close to reCaptcha's effectiveness). We used it, but not without an awful lot of fuss (because it is not an open source solution). Beyond that, though, there are many legitimate concerns around the effectiveness of captchas in general (even reCaptcha has reportedly been 'cracked') - google around for more info if you ever have trouble falling asleep or are just generally curious about it.