The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
(I was just reminded of this by a friend I lured into joining Wikivoyage - who can see and is highly literate, but found the captcha really troublesome.)
Why are we still using this?
- d.
On 20 January 2013 21:22, David Gerard dgerard@gmail.com wrote:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
Not to mention this, which delighted Tom Morris at an editathon he was running:
https://commons.wikimedia.org/wiki/File:CAPTCHA_headshits.jpg
- d.
On Sun, Jan 20, 2013 at 1:22 PM, David Gerard dgerard@gmail.com wrote:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
(I was just reminded of this by a friend I lured into joining Wikivoyage - who can see and is highly literate, but found the captcha really troublesome.)
Why are we still using this?
- d.
Hi David,
This question is something we've also been asking ourselves on the E3 team, as part of our work on account creation. I think we all agree that CAPTCHAs are at best a necessary evil. They are a compromise we make in our user experience, in order to combat automated attacks.
If anyone is interested in seeing first hand how CAPTCHAs impact users, you should definitely watch the remote user tests we did at https://www.mediawiki.org/wiki/Account_creation_user_experience/User_testing... or at least scan the notes. Those highlighted for us that, even in the much nicer-looking new version we built, the CAPTCHA is a pain in the ass.
To get more numbers on how much taking away the CAPTCHA might gain us in terms of human registrations, we have considered a two hour test (to start with) of removing the CAPTCHA from the registration page: https://meta.wikimedia.org/wiki/Research:Account_creation_UX/CAPTCHA That kind of test would probably not be an accurate measurement of what kind of spam would be unleashed if we permanently removed it, but the hourly volume of registrations on enwiki is enough to tell us the human impact.
There are other, easier things we can do to make this issue far less severe, if we can't remove it entirely without lots of rigorous testing...
One thing we recently did was simply regenerate the images with font rendering that is easier to read for people; our system for generating CAPTCHAs is arcane to my eyes, but Aaron Schulz was able to accomplish that without much headache.
The other thing in the works is adding a refresh button to the CAPTCHAs, which if done correctly could make a huge difference IMO: https://gerrit.wikimedia.org/r/#/c/44376/ That patch still needs UI improvements and testing, but any help would be most welcome.
Steven
This question is something we've also been asking ourselves on the E3 team, as part of our work on account creation. I think we all agree that CAPTCHAs are at best a necessary evil. They are a compromise we make in our user experience, in order to combat automated attacks.
That's kind of missing the point of the original poster. The point being that they are an *un*necessary evil and do not prevent automated attacks whatsoever.
[Snip]
To get more numbers on how much taking away the CAPTCHA might gain us in terms of human registrations, we have considered a two hour test (to start with) of removing the CAPTCHA from the registration page: https://meta.wikimedia.org/wiki/Research:Account_creation_UX/CAPTCHA That kind of test would probably not be an accurate measurement of what kind of spam would be unleashed if we permanently removed it, but the hourly volume of registrations on enwiki is enough to tell us the human impact.
That would be interesting. Remember that captchas aren't just on the user reg page though.
-bawolff
On Sun, Jan 20, 2013 at 6:46 PM, Bawolff Bawolff bawolff@gmail.com wrote:
This question is something we've also been asking ourselves on the E3 team, as part of our work on account creation. I think we all agree that CAPTCHAs are at best a necessary evil. They are a compromise we make in our user experience, in order to combat automated attacks.
That's kind of missing the point of the original poster. The point being that they are an *un*necessary evil and do not prevent automated attacks whatsoever.
[Snip]
We actually don't know that, and "whatsoever" is probably a gross exaggeration.
To get more numbers on how much taking away the CAPTCHA might gain us in terms of human registrations, we have considered a two hour test (to start with) of removing the CAPTCHA from the registration page: https://meta.wikimedia.org/wiki/Research:Account_creation_UX/CAPTCHA That kind of test would probably not be an accurate measurement of what kind of spam would be unleashed if we permanently removed it, but the hourly volume of registrations on enwiki is enough to tell us the human impact.
That would be interesting. Remember that captchas aren't just on the user reg page though.
-bawolff
Yeah I would prefer we only test removal on the registration page.
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l
* David Gerard dgerard@gmail.com [2013-01-20 22:23]:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
I can't speak for WMF sites, but on my small wiki, it *does* keep some spambots out. Not all of them, but it is still quite useful to have captchas there.
Regards, Thomas
On 01/20/2013 04:22 PM, David Gerard wrote:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
I don't see how the spambot statement is true. Do you have evidence for it?
Also, it's one of the nicer CAPTCHAS around here, it's certainly more readable than reCAPTCHA.
-- Victor.
On 21 January 2013 05:13, Victor Vasiliev vasilvv@gmail.com wrote:
On 01/20/2013 04:22 PM, David Gerard wrote:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
I don't see how the spambot statement is true. Do you have evidence for it?
That spambots get through at all.
- d.
On Mon, 2013-01-21 at 07:48 +0000, David Gerard wrote:
On 21 January 2013 05:13, Victor Vasiliev vasilvv@gmail.com wrote:
On 01/20/2013 04:22 PM, David Gerard wrote:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
I don't see how the spambot statement is true. Do you have evidence for it?
That spambots get through at all.
Evidence is not provided by simply repeating the statement. :)
"at all" implies that some spambots are blocked at least?
andre
On 21 January 2013 07:56, Andre Klapper aklapper@wikimedia.org wrote:
"at all" implies that some spambots are blocked at least?
Yes, but to count as successful it would have to block approximately all, I'd think.
I mean, you could redefine "something that doesn't block all spambots but does hamper a significant proportion of humans" as "successful", but it would be a redefinition.
- d.
On Mon, Jan 21, 2013 at 3:00 AM, David Gerard dgerard@gmail.com wrote:
I mean, you could redefine "something that doesn't block all spambots but does hamper a significant proportion of humans" as "successful", but it would be a redefinition.
It's not a definition, it's a judgment.
And whether or not it's a correct judgment depends on how many spambots are blocked, and how many productive individuals are "hampered", among other things.
After all, reverting spam hampers people too.
Given that there are algorithms that can solve our captcha, presumably it is mostly stopping the lazy and those who don't have enough knowledge to use those algorithms. I would guess that text on an image without any blurring or manipulation would be just as hard for those sorts of people to break. (Obviously that's a rather large guess.) As a compromise maybe we should have straight-text image captchas.
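The plumbing for such a scheme is simple either way; only the rendering step changes. A minimal sketch of the challenge/verify flow (hypothetical helper names; a stateless HMAC token stands in here for ConfirmEdit's real server-side storage, and the image rendering is left as a stub):

```python
import hashlib
import hmac
import secrets
import string

SECRET = b"per-site-captcha-secret"  # hypothetical per-wiki secret


def new_challenge(length=5):
    """Pick a random answer and derive a keyed token for it.

    In a real deployment the answer would be rendered as plain,
    undistorted text in an image and the token shipped alongside it.
    """
    answer = "".join(secrets.choice(string.ascii_uppercase) for _ in range(length))
    token = hmac.new(SECRET, answer.encode(), hashlib.sha256).hexdigest()
    return answer, token  # in practice: (image_of(answer), token)


def check(response, token):
    """Case-insensitive, timing-safe check of the user's response."""
    expected = hmac.new(SECRET, response.strip().upper().encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token)
```

The stateless token avoids keeping per-challenge server state, at the cost that a solved (answer, token) pair could be replayed unless a nonce or expiry is folded into the HMAC.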
-bawolff
On 2013-01-21 7:40 PM, "Anthony" wikimail@inbox.org wrote:
On Mon, Jan 21, 2013 at 3:00 AM, David Gerard dgerard@gmail.com wrote:
I mean, you could redefine "something that doesn't block all spambots but does hamper a significant proportion of humans" as "successful", but it would be a redefinition.
It's not a definition, it's a judgment.
And whether or not it's a correct judgment depends on how many spambots are blocked, and how many productive individuals are "hampered", among other things.
After all, reverting spam hampers people too.
On 22/01/13 01:44, Bawolff Bawolff wrote:
Given that there are algorithms that can solve our captcha, presumably it is mostly stopping the lazy and those who don't have enough knowledge to use those algorithms. I would guess that text on an image without any blurring or manipulation would be just as hard for those sorts of people to break. (Obviously that's a rather large guess.) As a compromise maybe we should have straight-text image captchas.
A simple thing that could be done is to introduce localized captchas on non-Latin wikis (just remember that not everyone has the appropriate keyboard). This would mean that new captcha-cracking algorithms would need to be developed for each script, and if the wikis are small, spammers would not bother.
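As a sketch of the idea (the word lists and function name here are hypothetical, purely to illustrate the per-script point; a real deployment would draw from each wiki's own dictionary):

```python
import secrets

# Hypothetical per-script challenge words. Each script would force an
# attacker to train or buy a separate OCR model.
WORDS = {
    "latin":    ["river", "mountain", "castle"],
    "cyrillic": ["город", "река", "гора"],
    "greek":    ["πόλη", "βουνό", "νησί"],
}


def challenge_for(script):
    """Pick a challenge word in the wiki's own script."""
    return secrets.choice(WORDS[script])
```

The keyboard caveat matters: a click-to-type character palette (or accepting a transliterated answer) would be needed for readers without the right input method.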
Nikola Smolenski wrote:
On 22/01/13 01:44, Bawolff Bawolff wrote:
Given that there are algorithms that can solve our captcha, presumably it is mostly stopping the lazy and those who don't have enough knowledge to use those algorithms. I would guess that text on an image without any blurring or manipulation would be just as hard for those sorts of people to break. (Obviously that's a rather large guess.) As a compromise maybe we should have straight-text image captchas.
A simple thing that could be done is to introduce localized captchas on non-Latin wikis (just remember that not everyone has the appropriate keyboard). This would mean that new captcha-cracking algorithms would need to be developed for each script, and if the wikis are small, spammers would not bother.
Any developer interested in Wikimedia wiki CAPTCHAs should look at https://www.mediawiki.org/wiki/Requests_for_comment/CAPTCHA. There's some low-hanging bug fruit there.
MZMcBride
On Tue, 2013-01-22 at 10:13 +0100, Nikola Smolenski wrote:
A simple thing that could be done is to introduce localized captchas on non-Latin wikis (just remember that not everyone has the appropriate keyboard).
For the records, this is covered in https://bugzilla.wikimedia.org/show_bug.cgi?id=5309
andre
On 01/21/2013 03:00 AM, David Gerard wrote:
On 21 January 2013 07:56, Andre Klapper aklapper@wikimedia.org wrote:
"at all" implies that some spambots are blocked at least?
Yes, but to count as successful it would have to block approximately all, I'd think.
That's dubious. Blocking all spambots is not the goal of any CAPTCHA. The goal is to significantly decrease spam, and thus to save human spam-fighters time.
Matt Flaschen
On 22 January 2013 04:28, Matthew Flaschen mflaschen@wikimedia.org wrote:
On 01/21/2013 03:00 AM, David Gerard wrote:
Yes, but to count as successful it would have to block approximately all, I'd think.
That's dubious. Blocking all spambots is not the goal of any CAPTCHA. The goal is to significantly decrease spam, and thus to save human spam-fighters time.
Per the previous comments in this post, anything over 1% precision should be regarded as failure, and our Fancy Captcha was at 25% a year ago. So yeah, approximately all, and our captcha is well known to actually suck.
- s.
On 22 January 2013 17:37, vitalif@yourcmc.ru wrote:
Per the previous comments in this post, anything over 1% precision should be regarded as failure, and our Fancy Captcha was at 25% a year ago. So yeah, approximately all, and our captcha is well known to actually suck.
Maybe you'll just use recaptcha instead of fancycaptcha?
The problem is that reCaptcha (a) used as a service, would pass private user data to a third party and (b) is closed source, so we can't just put up our own instance. Has anyone reimplemented it or any of it? There's piles of stuff on Wikisource we could feed it, for example.
- d.
The problem is that reCaptcha (a) used as a service, would pass private user data to a third party and (b) is closed source, so we can't just put up our own instance. Has anyone reimplemented it or any of it? There's piles of stuff on Wikisource we could feed it, for example.
OK, then we can take KCaptcha and integrate it as an extension. It's a Russian project, I've used it many times. Seems to be rather strong. http://www.captcha.ru/en/kcaptcha/
On 22/01/13 18:43, David Gerard wrote:
On 22 January 2013 17:37, vitalif@yourcmc.ru wrote:
Per the previous comments in this post, anything over 1% precision should be regarded as failure, and our Fancy Captcha was at 25% a year ago. So yeah, approximately all, and our captcha is well known to actually suck.
Maybe you'll just use recaptcha instead of fancycaptcha?
The problem is that reCaptcha (a) used as a service, would pass private user data to a third party and (b) is closed source, so we can't just put up our own instance. Has anyone reimplemented it or any of it? There's piles of stuff on Wikisource we could feed it, for example.
http://wikimania2012.wikimedia.org/wiki/Submissions/Wikicaptcha:_a_ReCAPTCHA...
https://github.com/CristianCantoro/wikicaptcha
I don't know what ultimately came of it.
On Tue, Jan 22, 2013 at 10:37 AM, vitalif@yourcmc.ru wrote:
Maybe you'll just use recaptcha instead of fancycaptcha?
/me gets popcorn to watch recaptcha flame war
There has been discussion on this list in the past about the use of recaptcha, but it has generally ended in a down-vote because reCaptcha is not open source (even though it supports free culture) nor is it something we can host on our own servers.
Even ignoring openness and privacy, exactly the same problems are present with reCAPTCHA as with Fancy Captcha. It's often very hard or impossible for humans to read, and is a big enough target to have been broken by various people.
I don't know if it's constructive to brainstorm solutions to a "problem" before we measure the extent of the problem, but a viable compromise is very easy captchas. Spammers vary a great deal in sophistication, but if we figure that any spammer sophisticated enough to do OCR is also capable of finding and downloading the existing public exploits of ours, then a block-capital Impact-font captcha is equally easy for them, equally difficult for unsophisticated spammers, and much easier for sighted humans.
Luke
On Tue, Jan 22, 2013 at 9:45 AM, Arthur Richards arichards@wikimedia.org wrote:
On Tue, Jan 22, 2013 at 10:37 AM, vitalif@yourcmc.ru wrote:
Maybe you'll just use recaptcha instead of fancycaptcha?
/me gets popcorn to watch recaptcha flame war
There has been discussion on this list in the past about the use of recaptcha, but it has generally ended in a down-vote because reCaptcha is not open source (even though it supports free culture) nor is it something we can host on our own servers.
-- Arthur Richards Software Engineer, Mobile [[User:Awjrichards]] IRC: awjr +1-415-839-6885 x6687
Luke Welling WMF wrote on 2013-01-22 21:59:
Even ignoring openness and privacy, exactly the same problems are present with reCAPTCHA as with Fancy Captcha. It's often very hard or impossible for humans to read, and is a big enough target to have been broken by various people.
It's very good to discuss, but what are the other options to minimize spam?
Luke, sorry for reiterating, but «brainstorm solutions to a "problem" before we measure the extent of the problem» is wrong: it's already been measured by others, see the other posts...
Nemo
That was not the end of the problem I was referring to. We know our specific captcha is broken at turning away machines. As far as I am aware, we do not know how many humans are being turned away by its difficulty. It's a safe bet that the number is non-zero, given the manual account requests we get, but since we have people to do those kinds of experiments, it would make sense to get a number from them before making any drastic decisions based on a reasonable gut feeling. I don't think anybody claims to have a perfect solution to the spam-vs-usability balancing act, so it's possible we'll try (and measure) a few approaches.
Luke
On Tue, Jan 22, 2013 at 10:16 AM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Luke, sorry for reiterating, but «brainstorm solutions to a "problem" before we measure the extent of the problem» is wrong: it's already been measured by others, see the other posts...
Nemo
On Tue, Jan 22, 2013 at 8:18 PM, Luke Welling WMF lwelling@wikimedia.org wrote:
That was not the end of the problem I was referring to. We know our specific captcha is broken at turning away machines. As far as I am aware we do not know how many humans are being turned away by the difficulty of it.
At the least, it's impossible for blind users to solve the captcha without an audio captcha (unless they manage to find the toolserver account creation thing and are motivated enough to do that).
I am not convinced of the benefits of captcha versus other spam filtering techniques.
Cheers, Katie
It's a safe bet that it is non-zero given the manual account requests we get, but given that we have people to do those kinds of experiments it would make sense to get a number from them before making any drastic decisions based on a reasonable gut feeling. I don't think anybody claims to have a perfect solution to the spam vs usability balancing act, so it's possible we'll try (and measure) a few approaches.
Luke
On Tue, Jan 22, 2013 at 10:16 AM, Federico Leva (Nemo) nemowiki@gmail.comwrote:
Luke, sorry for reiterating, but «brainstorm solutions to a "problem" before we measure the extent of the problem» is wrong: it's already been measured by others, see the other posts...
Nemo
On 2013-01-22 3:30 PM, "aude" aude.wiki@gmail.com wrote:
On Tue, Jan 22, 2013 at 8:18 PM, Luke Welling WMF lwelling@wikimedia.org wrote:
That was not the end of the problem I was referring to. We know our specific captcha is broken at turning away machines. As far as I am aware we do not know how many humans are being turned away by the difficulty of it.
It's at least impossible for blind users to solve the captcha, without an audio captcha. (unless they manage to find the toolserver account creation thing and enough motivated to do that)
I am not convinced of the benefits of captcha versus other spam filtering techniques.
Cheers, Katie
Someone should write a browser addon to automatically decode and fill in captchas for blind users. (Only half joking)
-bawolff
On Wed, Jan 23, 2013 at 3:53 AM, Bawolff Bawolff bawolff@gmail.com wrote:
Someone should write a browser addon to automatically decode and fill in captchas for blind users. (Only half joking)
Don't joke, I have a blind relative whose screen reader does just that (simple captchas only).
There are other services like http://www.azavia.com/zcaptcha which is specifically for the blind, but hell, it's probably cheaper to use the same captcha-reading services that the spammers do.
-- Chris
Luke, "we do not know how many humans are being turned away by the difficulty": actually, we sort of do; that paper tells us this as well. It's where their study came from, and it gives recommendations on which captcha techniques best balance efficacy against difficulty for humans. We don't seem to be following any of them (except waving, which, they say, shouldn't be used alone). I'm not qualified to say whether their recommendations are the best, and I've not searched for other studies, but it's not correct to say that we start from zero or that we have to run the study ourselves (an unreasonable requirement that implies we'll never change anything until we're forced to make our wikis read-only due to spam, as many MediaWiki users before us have been).
Nemo
I don't know if we are talking at cross purposes, or if I missed it, but this paper: http://elie.im/publication/text-based-captcha-strengths-and-weaknesses does not try to answer my question.
What I want to know is "*How many humans get turned away from editing Wikipedia by a difficult captcha?*"
The same authors have: http://elie.im/publication/how-good-are-humans-at-solving-captchas-a-large-s... which is closer to what I want to know. They show humans solving different text-based captchas with an accuracy rate of 70% to 98%. Unfortunately, Wikipedia was not one of the captcha schemes they used in that study, and they don't attempt to measure how many people try again if they fail.
If 2% of people fail on the first try but 90% of the fails reattempt and only 1% fail a second time that's an inconvenience, but probably worth it if it reduces the inconvenience of spam.
If 30% of people fail on the first try and 90% of them give up and never try to edit again, that's a disaster.
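The arithmetic behind those two scenarios is worth writing down; a small sketch (the retry behaviour is exactly the unknown under discussion, so these inputs are illustrative only):

```python
def fraction_lost(fail_first, retry_rate, fail_second):
    """Fraction of humans who never get past the captcha: those who
    fail once and give up, plus those who retry and fail again
    (pessimistically assuming second-time failers also give up)."""
    give_up = fail_first * (1 - retry_rate)
    fail_twice = fail_first * retry_rate * fail_second
    return give_up + fail_twice

# Scenario one: 2% fail, 90% of failers retry, 1% fail again
#   -> roughly 0.2% of would-be editors lost: an inconvenience.
# Scenario two: 30% fail and only 10% of failers retry
#   -> over 27% lost: a disaster.
```

The point of the sketch is how steeply the outcome depends on the retry rate, which is precisely what no one has measured for our captcha.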
Luke
On Wed, Jan 23, 2013 at 12:24 AM, Federico Leva (Nemo) nemowiki@gmail.com wrote:
Luke, "we do not know how many humans are being turned away by the difficulty": actually we sort of do, that paper tells this as well. It's where their study came from, and gives recommendations on what captcha techniques are best for balancing efficacy with difficulty for humans. We don't seem to be following any (except waving, which, they say, shouldn't be used alone). Then, I'm not qualified to say if their recommendations are the best and I've not searched other studies, but it's not correct to say that we start from zero or that we have to study by ourselves (an unreasonable requirement that implies we'll never change anything until we'll be forced to make our wikis read-only due to spam, as many MediaWiki users before us).
Nemo
On Tue, Jan 22, 2013 at 5:18 PM, Luke Welling WMF lwelling@wikimedia.org wrote:
That was not the end of the problem I was referring to. We know our specific captcha is broken at turning away machines. As far as I am aware we do not know how many humans are being turned away by the difficulty of it. It's a safe bet that it is non-zero given the manual account requests we get, but given that we have people to do those kinds of experiments it would make sense to get a number from them before making any drastic decisions based on a reasonable gut feeling. I don't think anybody claims to have a perfect solution to the spam vs usability balancing act, so it's possible we'll try (and measure) a few approaches.
I think the impact on humans will be measurable once actions which trigger a CAPTCHA are logged: https://bugzilla.wikimedia.org/show_bug.cgi?id=41522 https://gerrit.wikimedia.org/r/#/c/40553/
Not sure about enwiki, but from my experience with hosting smaller wikis, CAPTCHAs are pretty useless (reCAPTCHA, FancyCAPTCHA, some custom ones).
The spambots keep on flooding through.
I've found it's much more effective to just use the AbuseFilter.
-- Chris
On 21/01/13 08:04, Chris Grant wrote:
Not sure about enwiki, but from my experience with hosting smaller wikis, CAPTCHAs are pretty useless (reCAPTCHA, FancyCAPTCHA, some custom ones).
The spambots keep on flooding through.
I've found it's much more effective to just use the AbuseFilter.
-- Chris
I've had a similar experience with what are probably similar projects. Abusefilter and phalanx do a lot more to actually block unwanted stuff than a simple captcha, and generally without disrupting good faith edits.
They also have the option of giving users more information about what to do in fuzzier cases, and why they need to... it's rather helpful when set up properly.
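AbuseFilter rules are a small DSL of their own; as a rough analogue (hypothetical thresholds, and Python rather than AbuseFilter's actual rule language), the kind of check that stops link-dropping bots without challenging ordinary edits looks like:

```python
import re

# Crude stand-in for an edit-filter rule: block brand-new accounts
# that drop several external links into a single edit.
LINK = re.compile(r"https?://", re.IGNORECASE)

NEW_ACCOUNT_AGE = 86400  # seconds; one day, a hypothetical threshold
LINK_LIMIT = 3           # also hypothetical


def should_block(added_text, account_age_seconds):
    """True if this edit matches the spammy-newcomer pattern."""
    new_account = account_age_seconds < NEW_ACCOUNT_AGE
    link_count = len(LINK.findall(added_text))
    return new_account and link_count >= LINK_LIMIT
```

Unlike a captcha, a filter like this only fires on the suspicious behaviour itself, so good-faith newcomers writing plain prose never see it.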
On 2013-01-21 3:56 AM, "Andre Klapper" aklapper@wikimedia.org wrote:
On Mon, 2013-01-21 at 07:48 +0000, David Gerard wrote:
On 21 January 2013 05:13, Victor Vasiliev vasilvv@gmail.com wrote:
On 01/20/2013 04:22 PM, David Gerard wrote:
The MediaWiki captcha is literally worse than useless: it doesn't keep spambots out, and it does keep some humans out.
I don't see how the spambot statement is true. Do you have evidence for it?
That spambots get through at all.
Evidence is not provided by simply repeating the statement. :)
Does http://elie.im/publication/text-based-captcha-strengths-and-weaknessescount as evidence? (Copied and pasted from the mailing list archives)
Sure, captchas do prevent some limited attacks - they make it more effort than a 5-minute Perl script. Most spammers are more sophisticated than that.
-bawolff
On 21 January 2013 08:43, Bawolff Bawolff bawolff@gmail.com wrote:
Does http://elie.im/publication/text-based-captcha-strengths-and-weaknessescount as evidence? (Copied and pasted from the mailing list archives)
404 :-) Correct link: http://elie.im/publication/text-based-captcha-strengths-and-weaknesses
- d.
And to be more explicit, quoting the paper in case someone really has doubts: «we deem a captcha scheme broken when the attacker is able to reach a precision of at least 1%». With our FancyCaptcha we are/were at 25% precision for attackers, so yes, it's officially broken, and it has been for more than a year http://thread.gmane.org/gmane.science.linguistics.wikipedia.technical/56387 so spammers have surely got better in the meantime.
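To put those numbers in concrete terms (an illustrative sketch; only the 1% threshold and the 25% figure come from the thread, the attempt rate below is arbitrary):

```python
def attempts_per_account(solver_precision):
    """Average captcha attempts a bot needs per successful solve."""
    return 1.0 / solver_precision


def expected_bot_solves(attempts_per_hour, solver_precision):
    """Expected automated solves per hour at a given attempt rate."""
    return attempts_per_hour * solver_precision

# At the paper's 1% "broken" threshold a bot needs ~100 tries per
# account; at FancyCaptcha's reported 25%, only 4 - so the captcha
# costs an automated attacker almost nothing.
```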
Nemo