On 24/07/12 16:11, matanya wrote:
Over the last few months, the spam rate stewards deal with has been rising.
Can you provide references? What is the spam, and the resulting work, made up of? Maybe we could make their lives easier by creating a new tool or better anti-spam measures.
I suggest we implement a new mechanism:
Instead of giving the user a CAPTCHA to solve, give him an image from Commons and ask him to add a brief description in his own language.
We can give him two images, one with a known description and the other without. After enough users translate the unknown one in the same way, we can use it as a verified translation. We rely on the description of the known image to decide whether to let the user create the account.
What if the known image isn't described the same way? Even assuming that we provide them with the English description (so that they know what it is, e.g. it's not a "house" but the Royal Palace of XYZ!), and that all our users understand English well enough to make a translation, not all translations will be the same. Suppose we get these different results: Fairytale graphic illustration, House of the grandmother of Little Red Riding Hood, House of Little Red Riding Hood granny, Picture of Perrault fairytale about Little Red Riding Hood, Image of Little Red Riding Hood story from Grimms' Fairy Tales.
It wouldn't be that bad to have differing _proposed descriptions_. But that is not so for the check description, where you would need to guess how it was previously translated by others (even if all translations were fair and accurate, with no misspellings at all).
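To make the concern concrete, here is a rough sketch of the "enough users agree" check applied to the five descriptions above. It is only an illustration under my own assumptions: the helper names (`normalize`, `agreed_description`) and the threshold of three votes are made up, and nothing like this exists in the CAPTCHA code today.

```python
import string
from collections import Counter


def normalize(description: str) -> str:
    """Lowercase, drop ASCII punctuation and collapse whitespace."""
    table = str.maketrans("", "", string.punctuation)
    return " ".join(description.lower().translate(table).split())


def agreed_description(answers, min_votes=3):
    """Return the normalized answer given by at least min_votes users, else None.

    min_votes is a hypothetical threshold chosen for this sketch.
    """
    if not answers:
        return None
    counts = Counter(normalize(answer) for answer in answers)
    best, votes = counts.most_common(1)[0]
    return best if votes >= min_votes else None


answers = [
    "Fairytale graphic illustration",
    "House of the grandmother of Little Red Riding Hood",
    "House of Little Red Riding Hood granny",
    "Picture of Perrault fairytale about Little Red Riding Hood",
    "Image of Little Red Riding Hood story from Grimms' Fairy Tales",
]

# All five answers are reasonable, yet no two of them match.
print(agreed_description(answers))
```

Even with case and punctuation ignored, free-text answers like these never converge, so an exact-match check against the known description would reject honest users. A looser similarity measure would help them, but it would also make the check easier to guess.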
Is it possible to embed a file from Commons in the login page? Is it possible to parse the entered text and store it?
Yes and yes (given some effort to make it happen, of course).
Benefits:
A) It would be harder for bots to create automated accounts.
B) We will get translations to many languages with little effort from the users signing up.
What do you think?
I agree with (B) in that we would get many translations (although probably low-quality ones). I am not so sure about (A). If the accounts are being created by bots, the captcha should be changed to stop them and/or new mechanisms (such as throttles) created. If they are created by hand, I see little difference from a spammer's POV. Making up a description is harder than typing a word, but we would need to dumb down the process, so not a big difference. And in little time they would learn how to game the system.
As for moving it forward, I think learning from entered values should be done in a generic way, and then the "recaptcha" proposal for helping Wikisource implemented. Your idea could be added later on (I see those flaws, though).
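By "throttles" I mean something along these lines: a minimal sliding-window limit on account creations per IP address. This is a sketch under my own assumptions, not how any existing throttle is implemented. The class name `CreationThrottle`, its parameters and the limits used are all hypothetical.

```python
from collections import defaultdict, deque


class CreationThrottle:
    """Allow at most max_accounts creations per IP within window_seconds."""

    def __init__(self, max_accounts, window_seconds):
        self.max_accounts = max_accounts
        self.window_seconds = window_seconds
        self._events = defaultdict(deque)  # ip -> timestamps of allowed creations

    def allow(self, ip, now):
        """Record and allow a creation at time `now` (seconds), or refuse it."""
        events = self._events[ip]
        # Forget creations that have fallen out of the window.
        while events and now - events[0] >= self.window_seconds:
            events.popleft()
        if len(events) >= self.max_accounts:
            return False
        events.append(now)
        return True


# Hypothetical limit: two accounts per hour from the same address.
throttle = CreationThrottle(max_accounts=2, window_seconds=3600)
results = [throttle.allow("192.0.2.1", t) for t in (0, 10, 20, 3600)]
print(results)  # [True, True, False, True]
```

A limit like this slows down a bot working from a single address, but it does nothing against accounts created by hand or spread over many addresses.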
Regards