On Fri, Jan 23, 2009 at 7:03 PM, Brion Vibber brion@wikimedia.org wrote:
On 1/23/09 2:36 AM, Andre Engels wrote:
Two questions:
- Why is this User Agent getting this response? If I remember correctly, this was installed in the early days of the pywikipediabot, when Brion wanted to block it because it had a programming error causing it to fetch each page twice (sometimes even more?). If that is the actual reason, I see no reason why it should still be active years afterward...
This has nothing to do with pywikipediabot.
We too frequently encountered poorly-written bots and site-scrapers which slammed the servers too hard and caused problems. Blocking the default UAs of common libraries cut these incidents down dramatically, and it helps encourage thoughtful bot writers to put specific information into their user-agent string, making it possible to track them down more easily if they are problematic.
Is there any list of those UAs or UA parts available? I had this problem some time ago with my bot, which used a custom UA string and got access denied, so I changed its UA to Firefox, as I did not have the patience to track down WHICH part of the UA triggered the filter.
Marco
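
For illustration only, here is a minimal sketch of what "specific information in the user-agent string" could look like from a Python bot using the standard library. The tool name, version, contact address and URL are made-up placeholders, and the "name/version (contact)" layout is just one plausible convention, not a format confirmed anywhere in this thread. Without an explicit header, urllib sends its own generic "Python-urllib/<version>" string, which is the kind of library default discussed above.

```python
import urllib.request


def make_user_agent(tool, version, contact):
    """Build a descriptive User-Agent string: tool/version (contact)."""
    return f"{tool}/{version} ({contact})"


def build_request(url, user_agent):
    # Setting the header explicitly replaces the library default
    # ("Python-urllib/<version>") for this request.
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Hypothetical bot identity and URL, for illustration only.
ua = make_user_agent("ExampleWikiBot", "0.1", "bot-operator@example.org")
req = build_request("https://example.org/w/index.php?title=Sandbox", ua)

# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

The request object is only built here, not sent; passing it to urllib.request.urlopen(req) would perform the fetch with that header. Whether a given string passes a particular site's filter still depends on that site's rules, which this sketch does not know.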