Hi André, Hi Tom,
I did change the user-agent: I knew py-urllib had been
banned but not why. Still, I think unique user-agents
should be used - nothing wrong with a little
accountability - when I spot an obviously faked one I
usually deny it as being up to no good. *#&%@(*!!
My 'robot' hardly deserves the name; it only fetches
and parses the Wikipedia RSS feeds. I'll have a look
at PWB, sounds interesting.
Ken
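
For reference, a minimal sketch of the kind of fetch-and-parse program described above, written against Python 3's standard library rather than whatever the original program used. The version number, the contact address, and the header layout in USER_AGENT are assumptions for illustration; the thread only says the program sends 'WikiWalker' and an email address. The helper names are hypothetical too.

```python
import urllib.request
import xml.etree.ElementTree as ET

# Hypothetical value: only the name 'WikiWalker' comes from the thread;
# the version and contact format are illustrative assumptions.
USER_AGENT = "WikiWalker/0.1 (contact: you@example.org)"


def build_request(url):
    """Return a Request that identifies the client instead of the library default."""
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


def parse_rss_items(xml_text):
    """Extract (title, link) pairs from an RSS 2.0 document (str or bytes)."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        items.append((title, link))
    return items


def fetch_feed(url, timeout=30):
    """Fetch a feed over the network and return its (title, link) pairs."""
    with urllib.request.urlopen(build_request(url), timeout=timeout) as resp:
        return parse_rss_items(resp.read())
```

Calling fetch_feed with a feed URL performs one request per call; build_request and parse_rss_items can be exercised without touching the network.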
--- Andre Engels <andreengels(a)gmail.com> wrote:
Another possibility could be that he failed to change the user-agent,
and the bot uses the default Python user agent, which is blocked
because the Python Wikipediabot used it in the past, and at that time
did not throttle and tended to load pages once to check whether they
existed, and then a second time to read their content.
I do wonder why we are still blocking it more than a year afterward...
Andre Engels
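
To illustrate the two behaviours mentioned above (no throttling, and loading a page twice), here is a rough sketch of a fetcher that spaces out requests and reads each page with a single request. It is not the Python Wikipediabot's code; the class name, the 5-second interval, and the convention that the fetch callable returns None for a missing page are all assumptions made for the example.

```python
import time


class ThrottledFetcher:
    """Space out requests and read each page with one request, not two."""

    def __init__(self, fetch, min_interval=5.0,
                 clock=time.monotonic, sleep=time.sleep):
        # fetch: hypothetical callable, url -> text, or None if the page is missing
        self._fetch = fetch
        self._min_interval = min_interval  # assumed delay in seconds
        self._clock = clock
        self._sleep = sleep
        self._last = None  # start time of the previous request

    def get(self, url):
        """Return the page text, or None if missing, without a separate existence check."""
        if self._last is not None:
            wait = self._min_interval - (self._clock() - self._last)
            if wait > 0:
                self._sleep(wait)
        self._last = self._clock()
        return self._fetch(url)
```

The clock and sleep parameters exist only so the delay logic can be checked without real waiting; in normal use the defaults apply.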
On Sat, 23 Oct 2004 14:04:44 -0700, Brion Vibber <brion(a)pobox.com> wrote:
On Oct 23, 2004, at 8:21 AM, Ken Ara wrote:
> I have been reading the "recent changes" and "new articles"
> RSS-feeds through My Yahoo. Now I've written a small Python program
> to fetch and parse these feeds, but it fails most of the time with a
> message about "intermittent server problem" and warning that my
> user-agent may be blocked. Since the feeds are available through my
> browser, I conclude that I have been blocked. (My program sends the
> user-agent, 'WikiWalker' and provides my email address).
Can you provide a dump of the sent and received HTTP headers? You might
just be getting timeouts; the servers have had some ups and downs
lately.
-- brion vibber (brion @ pobox.com)
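
One way such a header dump could be produced from Python 3 is sketched below. It reports the headers urllib records on the request object plus the response status and headers, which is close to, but not identical to, a wire capture. The function names and the output layout are assumptions for illustration.

```python
import urllib.error
import urllib.request


def format_headers(title, pairs):
    """Render a title line followed by indented 'Name: value' lines."""
    lines = [title]
    lines += [f"  {name}: {value}" for name, value in pairs]
    return "\n".join(lines)


def dump_exchange(url, user_agent, timeout=30):
    """Make one request and return a text dump of sent and received headers."""
    req = urllib.request.Request(url, headers={"User-Agent": user_agent})
    try:
        resp = urllib.request.urlopen(req, timeout=timeout)
    except urllib.error.HTTPError as err:
        # Error responses (for example 403 or 503) still carry a status and headers.
        resp = err
    try:
        sent = format_headers(
            f"> {req.get_method()} {req.full_url}", req.header_items()
        )
        received = format_headers(
            f"< {resp.getcode()} {resp.reason}", resp.headers.items()
        )
    finally:
        resp.close()
    return sent + "\n" + received
```

A timeout or connection failure is not caught here and surfaces as an exception, which by itself helps separate the "just timeouts" case from a response that comes back with an error status.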
_______________________________________________
Wikitech-l mailing list
Wikitech-l(a)wikimedia.org