I periodically experience a DDoS attack from Microsoft. It appears to be their search engines, although I guess a bot network could be messing with reverse DNS. The attacks come from names like "msnbot-nn-nn-nn-nn.search.msn.com", where "nn" are byte values in the IP address. There will be a dozen or more crawling my site at the same time.
The symptom is that these guys are so hot and heavy that the number of httpd instances shoots through the roof to the point that none of them get serviced before timing out.
I've complained to Microsoft, but of course, received no answer.
So why am I complaining here? The logs show that this is only happening to MediaWiki sites I host -- other, simpler sites don't seem to act like a tar pit.
I've tried adding ipfilter(8) blocks, but then they just pop up on some other subnet. Also, I don't want to block legit traffic coming from Microsoft. I also don't want to stop spidering via "robots.txt" because I want well-behaved search engines like Google to have access.
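On the reverse DNS doubt: a forward-confirmed reverse DNS lookup is one way to check whether a client is a real Microsoft crawler before deciding to block it. The sketch below looks up the PTR name for an IP, checks that it ends in ".search.msn.com" (the suffix seen in the logs above), and then confirms that the name resolves back to the same IP. The function name is made up for illustration, and the two resolver arguments exist only so the check can be exercised without network access.

```python
import socket


def is_genuine_msnbot(ip, reverse=socket.gethostbyaddr,
                      forward=socket.gethostbyname_ex):
    """Return True if ip passes a forward-confirmed reverse DNS check.

    Hypothetical helper: the ".search.msn.com" suffix is taken from the
    host names reported in this thread, not from any official list.
    """
    try:
        # Reverse lookup: IP -> PTR host name (first element of the tuple).
        hostname = reverse(ip)[0]
    except OSError:
        return False  # no PTR record or resolver failure

    if not hostname.lower().endswith(".search.msn.com"):
        return False

    try:
        # Forward lookup: host name -> list of IPv4 addresses.
        addresses = forward(hostname)[2]
    except OSError:
        return False

    # A spoofed PTR record will not resolve back to the original IP.
    return ip in addresses
```

Called as is_genuine_msnbot("a.b.c.d") it uses the system resolver. A client that fails the check is only claiming to be msnbot and could be blocked without affecting real Microsoft traffic.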
Anyone else seen this aggressive crawling of their wiki sites? Any ideas for fixing it?
Thanks for any advice offered!
---------------- :::: The way you see people is the way you treat them. -- Zig Ziglar :::: Jan Steinman, EcoReality Co-op ::::
Have you identified the user agent they are using?
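If it helps, here is a minimal sketch for tallying user agents from the web server's access log. It assumes the log is in Apache's "combined" format, where the user agent is the last quoted field on each line; that format, the function name, and the log path in the usage note are assumptions, not something stated in this thread.

```python
import re
from collections import Counter

# Assumed "combined" log layout:
#   ... "request line" status bytes "referer" "user-agent"
UA_PATTERN = re.compile(r'"[^"]*" \d{3} \S+ "[^"]*" "([^"]*)"\s*$')


def count_user_agents(lines):
    """Count requests per user-agent string (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:  # skip lines that do not fit the assumed format
            counts[match.group(1)] += 1
    return counts
```

Something like count_user_agents(open("/path/to/access_log")).most_common(10) would then list the busiest clients, which should show whether the traffic identifies itself as msnbot, bingbot, or something else.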
On Mon, Apr 1, 2013 at 11:45 AM, Jan Steinman Jan@bytesmiths.com wrote:
MediaWiki-l mailing list MediaWiki-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-l
That's probably bingbot, it's aggressive on our wiki too.
A few days ago I added the following to robots.txt:

    User-agent: bingbot
    Crawl-delay: 1

I haven't checked yet whether it makes a difference.
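One thing that can be checked right away is whether the rule parses as intended. The sketch below feeds the two lines above to Python's standard urllib.robotparser and asks for the crawl delay that applies to a given agent token. It only shows how Python's parser reads the file; how Bing itself honors Crawl-delay is a separate question that only the access logs can answer. The helper name is made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule quoted above.
ROBOTS_TXT = """\
User-agent: bingbot
Crawl-delay: 1
"""


def crawl_delay_for(robots_text, agent):
    """Return the Crawl-delay that applies to agent, or None if there is none."""
    parser = RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser.crawl_delay(agent)
```

With this file, crawl_delay_for(ROBOTS_TXT, "bingbot") gives 1, while an agent with no matching section, such as "Googlebot", gives None, so other crawlers are not slowed down by the rule.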
Also, have a look at http://www.bing.com/webmaster/help/how-to-report-an-issue-with-bingbot-25c19...
Greetings Stip