On 1/23/09 2:36 AM, Andre Engels wrote:
> Two questions:
> - Why is this User Agent getting this response? If I remember correctly, this was installed in the early days of the pywikipediabot, when Brion wanted to block it because it had a programming error causing it to fetch each page twice (sometimes even more?). If that is the actual reason, I see no reason why it should still be active years afterward...
This has nothing to do with pywikipediabot.
We too frequently encountered poorly written bots and site scrapers that slammed the servers too hard and caused problems. Blocking the default UAs of common libraries cut these incidents down dramatically, and it helps encourage thoughtful bot writers to put specific information into their user-agent string, which makes it easier to track them down if they are problematic.
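As a rough illustration of the point about user-agent strings, here is a minimal Python sketch of a bot identifying itself instead of sending its HTTP library's default UA (for urllib that default looks like "Python-urllib/<version>"). The helper name build_request, the "<name>/<version> (<contact>)" layout, and the bot name and contact address are all made-up placeholders, not a format the servers are known to require.

```python
import urllib.request


def build_request(url, bot_name, version, contact):
    """Return a Request carrying a descriptive User-Agent header."""
    # Hypothetical layout "<name>/<version> (<contact>)"; an assumption,
    # not an official requirement.
    user_agent = f"{bot_name}/{version} ({contact})"
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Placeholder values for illustration only.
request = build_request(
    "http://en.wikipedia.org/wiki/Foo",
    bot_name="ExampleBot",
    version="0.1",
    contact="operator@example.org",
)

# urllib stores header names capitalized as "User-agent".
print(request.get_header("User-agent"))
```

The request object is only built here, not sent; passing it to urllib.request.urlopen would perform the actual fetch with that header attached.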
> - If this User Agent is really to be blocked, why do we still provide the content of the page that is forbidden?
We don't; you get a big fat Wikimedia-customized error page with a generic multilingual message, and this bit somewhere in the middle:
<!-- Technical details of the error; shows all the time, with any language -->
<div class="TechnicalStuff">
<bdo dir="ltr">
Request: GET http://en.wikipedia.org/wiki/Foo, from 69.17.48.227 via sq24.wikimedia.org (squid/2.6.STABLE21) to ()<br/>
Error: ERR_ACCESS_DENIED, errno [No Error] at Fri, 23 Jan 2009 17:59:46 GMT
</bdo>
<div id="AdditionalTechnicalStuff"></div>
</div>
-- brion