On 20 February 2010 23:00, Ævar Arnfjörð Bjarmason avarab@gmail.com wrote:
On Thu, Feb 4, 2010 at 14:37, Aryeh Gregor Simetrical+wikilist@gmail.com wrote:
On Wed, Feb 3, 2010 at 5:11 PM, Trevor Parscal tparscal@wikimedia.org wrote:
Are the stats set up to differentiate between real IE6 users and Bing autosurfing?
I'd be pretty surprised if Bing is generating enough traffic to noticeably affect the percentage, even if it does get counted as IE6.
Bing can hit you pretty hard: http://blogs.perl.org/users/cpan_testers/2010/01/msnbot-must-die.html
Well... it is not a crawler, it seems to be a cracracracracracracracracrawler, given the way it repeats the same request N times. It does not act like a single crawler, but like a group of crawlers with no intercommunication, all coming from the same range of IPs. A simple optimization would be for all these crawlers to share the robots.txt file (isn't this obvious?). Since that response is not shared, you see every instance making its own separate request for it. There's also no synchronization, so all the crawlers can hit your site at the same time, say... 15 asking for robots.txt at once, or spread over 2 hours; it's just luck.
It seems a... simplistic and brute-force approach to internet indexing. :-/ It seems Microsoft is throwing money at the problem, but not brains.
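
To make the robots.txt point concrete, here is a rough sketch of what "sharing" could look like inside one crawler fleet: a small per-host cache that every worker consults before fetching anything. This is only an illustration, not how msnbot or any real crawler is built. The class name `SharedRobotsCache`, the `fetch` callable and the one-hour TTL are all made up for the example; only `urllib.robotparser` comes from the Python standard library.

```python
import threading
import time
from urllib.robotparser import RobotFileParser


class SharedRobotsCache:
    """Hypothetical cache: one parsed robots.txt per host, shared by all workers."""

    def __init__(self, fetch, ttl_seconds=3600.0):
        # `fetch` is an assumed callable: host -> robots.txt body as text.
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._lock = threading.Lock()
        self._entries = {}  # host -> (expiry time, RobotFileParser)

    def parser_for(self, host):
        # The lock is held during the fetch so concurrent workers wait for
        # the first one instead of each requesting robots.txt themselves.
        # (A real crawler would probably lock per host; this is a sketch.)
        with self._lock:
            now = time.monotonic()
            entry = self._entries.get(host)
            if entry is None or entry[0] <= now:
                parser = RobotFileParser()
                parser.parse(self._fetch(host).splitlines())
                entry = (now + self._ttl, parser)
                self._entries[host] = entry
            return entry[1]

    def allowed(self, host, agent, path):
        # True if `agent` may fetch `path` on `host` per the cached rules.
        return self.parser_for(host).can_fetch(agent, path)
```

With something like this, 15 workers starting on the same host at the same moment would cause one robots.txt request per TTL window instead of 15. It does nothing about the lack of synchronization for the page requests themselves; that would need a shared per-host rate limiter on top.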