Benjamin Lees <emufarmers(a)gmail.com> wrote:
> Is there an actual problem you're trying to solve here? Is there any
> indication that spam bots are affecting your site's performance? If not,
> worrying about this is probably a waste of your time.
Spambots driving up CPU usage is a known issue:
https://www.google.com/search?q=spam+bots+cpu&ie=utf-8&oe=utf-8&…
Is it a problem? Yes, they're constantly trying to break in, and that
increases CPU usage. I don't have any analysis to prove it, but many times
I've seen CPU usage go very high while traffic (per Google Analytics)
stayed normal. I came from a shared server where I was actually asked to
leave because of the CPU usage. I've had big problems with CPU.
I'm on a VPS now and it can still be a problem. Average CPU usage recently
went up from around 20% to 160% (it's multi-core, which I assume is why it
can go over 100%) for a few hours, while Google Analytics showed no
change. Whether this is a malicious/DDoS bot or an advertising bot, this is
something that needs to be studied and dealt with. If I stay on 200% CPU
usage on the VPS I may be asked to leave the server. So yes, I have to keep
a watch on CPU and I have to explore all possible options to keep the
usage down. I'm using caching and nginx (earlier suggestions by people on
this list).
As to how to prevent genuine visitors from being blocked, that's problem #2,
and it's something that can be improved.
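A rough Python sketch of the temporary-ban idea quoted at the bottom of
this message (N failed captchas within about 5 minutes leads to a 24-hour
ban). This is only an illustration under assumptions: the class name, the
threshold of 10 failures, and the in-memory storage are all made up here,
and a real setup would still need to push the ban into htaccess or a
firewall.

```python
import time
from collections import defaultdict, deque

# All values below are illustrative assumptions, not tested recommendations.
MAX_FAILURES = 10             # failed captchas tolerated per window
WINDOW_SECONDS = 5 * 60       # the "5 minutes or so" window
BAN_SECONDS = 24 * 60 * 60    # the "say, 24 hours" temporary ban


class CaptchaBanTracker:
    """Hypothetical in-memory tracker of failed captchas per IP address."""

    def __init__(self, max_failures=MAX_FAILURES,
                 window=WINDOW_SECONDS, ban=BAN_SECONDS):
        self.max_failures = max_failures
        self.window = window
        self.ban = ban
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self._banned_until = {}              # ip -> time at which the ban expires

    def record_failure(self, ip, now=None):
        """Note one failed captcha; ban the IP if it hit the threshold."""
        now = time.time() if now is None else now
        recent = self._failures[ip]
        recent.append(now)
        # Forget failures that have aged out of the window.
        while recent and now - recent[0] > self.window:
            recent.popleft()
        if len(recent) >= self.max_failures:
            self._banned_until[ip] = now + self.ban
            recent.clear()

    def is_banned(self, ip, now=None):
        """Return True while the IP's temporary ban is still running."""
        now = time.time() if now is None else now
        expires = self._banned_until.get(ip)
        if expires is None:
            return False
        if now >= expires:
            del self._banned_until[ip]  # ban has expired
            return False
        return True
```

A higher threshold or a shorter ban would be one way to lower the risk of
locking out genuine visitors who simply get the captcha wrong a few times.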
I'll try this one, suggested by Henny:
http://danielwebb.us/software/bot-trap/
Anne wrote:
> +1 - there is one well-known blog site that uses captcha, and I've tried
> as many as 10 times on a single submission, only to give up because I
> simply couldn't get the captcha right. Now I don't even try to comment
> there.
They need a better captcha. But yes, you guys have reminded me that
whatever method is used, I need to make sure genuine visitors are not
affected. The link from Henny might take care of it, as the 'hidden' link
is not seen by humans.
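The general hidden-link idea can be sketched in a few lines of Python.
This is not the code from the linked bot-trap package, just an
illustration of the approach; the trap path and class name are
hypothetical, and it assumes the trap URL is also disallowed in
robots.txt so that well-behaved crawlers never follow it.

```python
TRAP_PATH = "/bot-trap/"  # hypothetical URL, linked invisibly from pages


class HiddenLinkTrap:
    """Blocks any IP that requests the hidden trap URL."""

    def __init__(self, trap_path=TRAP_PATH):
        self.trap_path = trap_path
        self.blocked = set()  # IPs that have followed the hidden link

    def handle(self, ip, path):
        """Return the HTTP status code to send for this request."""
        if ip in self.blocked:
            return 403  # already caught earlier
        if path.startswith(self.trap_path):
            # Humans never see the link, so only a bot should land here.
            self.blocked.add(ip)
            return 403
        return 200  # normal request, serve as usual
```

Since a human visitor never sees the link, ordinary browsing is never
penalized; only clients that crawl every URL they find get blocked.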
thanks
Dan
On Fri, Mar 29, 2013 at 1:29 PM, Benjamin Lees <emufarmers(a)gmail.com> wrote:
> On Thu, Mar 28, 2013 at 11:32 PM, Dan Fisher <danfisher261(a)gmail.com>
> wrote:
>> Here's one idea: If a certain IP address fails the captchas a specified
>> number of times in 5 minutes or so, it should be banned temporarily for
>> say, 24 hours (through htaccess or firewall etc).
> Humans regularly get CAPTCHAs wrong, and they often do so multiple times
> (if you have any elderly relatives, feel free to see how many tries it
> takes them to solve a reCAPTCHA one). Blocking them from even viewing your
> site for a day seems a little extreme.
> Is there an actual problem you're trying to solve here? Is there any
> indication that spam bots are affecting your site's performance? If not,
> worrying about this is probably a waste of your time.
_______________________________________________
MediaWiki-l mailing list
MediaWiki-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-l