Hi,
On Tue, Feb 16, 2010 at 8:31 PM, Domas Mituzas <midom.lists(a)gmail.com> wrote:
You can sure assume, that we need to come up with
something to "defend a new policy".
Yeah, ban no/broken-UA clients for these things that do cause CPU load,
but leave article reading unharmed. Normal readers with Privoxy or other
privacy filters (you know, people DO still use them, even if their
percentage is small!) can at least READ, then.
Presumably some percentage of that 20-50% will come back as the spammers
realize they have to supply the string. Presumably we then start playing
whack-a-mole.
Yes, we will ban all IPs participating in this.
Good luck fighting a dynamic bot herder (though I do ask myself, with the
spam blacklist and the captchas for URLs, what the hell can a botnet
master achieve by hitting Wikipedia?!).
Presumably there's a plan for what to do when the spammers begin
supplying a new, random string every time.
Random strings are easy to identify, fixed strings are easy to verify.
The point is, what should bot writers do:
1) no UA at all; that's the typical mistake of a newbie who just sends
GET /w/index.php?action=edit, which works with his localhost wiki and
every other wegs.
2) default UA of the programming language (PHP's thingy, cURL, Python;
some bots may even use wget and bash scripting, it's not THAT
difficult to write a Wikibot as a bash script!)
3) own UA (stuff like "HDBot v1.1 (http://xyz.tld)", which I couldn't
use some time ago)
4) spoof a browser UA (bad, as the site can't distinguish between bot and browser)
To avoid the ban, only 3 and 4 are possible, as the default UAs are
blocked in most cases. But as 3 doesn't really work, or at least is hard
to troubleshoot, that leaves only 4, which you do not want.
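
For reference, option 3 could look roughly like this on the bot side. This is only a minimal sketch, assuming Python's standard urllib; the bot name is the example string from above, and the localhost URL and the helper name build_request are placeholders of mine, not anything the servers prescribe. It only builds the request and does not send it:

```python
import urllib.request

# Example UA string from the list above; bot name and contact URL are placeholders.
USER_AGENT = "HDBot v1.1 (http://xyz.tld)"


def build_request(url: str) -> urllib.request.Request:
    """Return a request that carries an explicit, bot-specific User-Agent
    instead of the library default (option 2) or no UA at all (option 1)."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


# Placeholder URL pointing at a local test wiki; nothing is sent over the network here.
req = build_request("http://localhost/w/index.php?action=edit")

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Whether a string like this gets through is exactly the open question; the sketch only shows that supplying it is a one-liner once the bot writer knows it is expected.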
Please write some doc that answers this once and for all.
Marco
PS: Oh, and please, please make the 403 message something that lets
people figure out what's wrong; that takes AGES if you are new to
scripting.
--
VMSoft GbR
Nabburger Str. 15
81737 München
Geschäftsführer: Marco Schuster, Volker Hemmert
http://vmsoft-gbr.de