If that's so, then you should be able to sort the problem by using this form of the link:
http://en.wikipedia.org/wiki/Special:Recentchanges?feed=rss
WT
-----Original Message-----
From: Timwi [mailto:timwi@gmx.net]
Sent: 16 September 2005 2:47 PM
To: wikien-l@wikipedia.org
Cc: wikitech-l@wikimedia.org
Subject: [WikiEN-l] Re: RSS feed not working for Google Personal
Phil Boswell wrote:
> The response was as follows: Your search http://en.wikipedia.org/w/index.php?title=Special:Recentchanges&feed=rss was blocked by that feed's robots.txt.
> What's going on?
It means that Wikimedia have kindly asked Google not to access certain parts of the site (namely everything in /w/) and that Google is kind enough to abide by it.
Timwi
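The effect Timwi describes is easy to reproduce with Python's standard-library robots.txt parser. The rule below is an assumption about what Wikimedia's robots.txt contained at the time, inferred from the behaviour reported in this thread rather than confirmed:

# A minimal sketch, assuming a blanket "Disallow: /w/" rule. It shows
# why an aggregator that honours robots.txt rejects the /w/ form of
# the feed URL but accepts the /wiki/ form.
import urllib.robotparser

robots_txt = """\
User-agent: *
Disallow: /w/
"""

parser = urllib.robotparser.RobotFileParser()
parser.parse(robots_txt.splitlines())

blocked = "http://en.wikipedia.org/w/index.php?title=Special:Recentchanges&feed=rss"
allowed = "http://en.wikipedia.org/wiki/Special:Recentchanges?feed=rss"

print(parser.can_fetch("*", blocked))  # False: everything under /w/ is disallowed
print(parser.can_fetch("*", allowed))  # True: /wiki/ paths are not excluded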
"Worldtraveller" wikipedia@world-traveller.org wrote in message news:57472.62.25.109.196.1126879570.squirrel@62.25.109.196...
> From: Timwi [mailto:timwi@gmx.net]
> Sent: 16 September 2005 2:47 PM
>
> Phil Boswell wrote:
>> The response was as follows: Your search http://en.wikipedia.org/w/index.php?title=Special:Recentchanges&feed=rss was blocked by that feed's robots.txt. What's going on?
>
> It means that Wikimedia have kindly asked Google not to access certain parts of the site (namely everything in /w/) and that Google is kind enough to abide by it.
>
> If that's so, then you should be able to sort the problem by using this form of the link: http://en.wikipedia.org/wiki/Special:Recentchanges?feed=rss
You beauty, that works a treat. That would likely explain all the hassle I've been having trying to get the various Wikicities feeds into the same page.

Would it be really stupid to ask why the URL isn't presented in this form on the actual page?
On 9/16/05, Phil Boswell phil.boswell@gmail.com wrote:
> Would it be really stupid to ask why the URL isn't presented in this form on the actual page?
I'm sure Tim or Brion knows the answer to this, but I'd hazard a guess that the /w form bypasses the squid caches.
In general, reader activities use the /wiki form of the URL, and in those cases it's okay if the reader sees a cached copy of the page from a squid proxy. Editing activities (edit, history, what-links-here, and special pages) tend to use the /w form; those requests would normally be performed by a script in real time, which the squid proxies can't do, since they aren't running PHP code and have no database access.
The squids are basically dumb mirrors that sit between the reader and the Wikipedia system, and they only fetch a page from the system if they have no up-to-date copy in their cache.
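That fetch-on-miss behaviour can be sketched in a few lines. This is a toy model of the caching logic Tony describes, not Squid's actual implementation; the TTL value and the backend function are illustrative assumptions:

import time

CACHE_TTL = 300  # seconds an entry counts as "up to date" (assumed value)

cache = {}  # url -> (fetched_at, body)

def fetch_from_backend(url):
    # Stand-in for the real request to the PHP application servers.
    return "<html>rendered copy of %s</html>" % url

def serve(url):
    entry = cache.get(url)
    if entry is not None:
        fetched_at, body = entry
        if time.time() - fetched_at < CACHE_TTL:
            return body             # hit: reader gets the stored copy
    body = fetch_from_backend(url)  # miss or stale: ask the backend
    cache[url] = (time.time(), body)
    return body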
On 16/09/05, Tony Sidaway f.crdfa@gmail.com wrote:
> On 9/16/05, Phil Boswell phil.boswell@gmail.com wrote:
>> Would it be really stupid to ask why the URL isn't presented in this form on the actual page?
>
> I'm sure Tim or Brion knows the answer to this, but I'd hazard a guess that the /w form bypasses the squid caches.
Actually, I think one of the prime reasons is precisely so that they can be excluded in robots.txt (it seems odd to me that an RSS aggregator would read robots.txt in that way, but never mind). Any URL with extra parameters is likely to be a view not suitable for crawlers, being either highly transitory or a different view of something already crawled, so those URLs aren't transformed into the /wiki/ form. I could be wrong about this, though.
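If that guess is right, the rule amounts to something like the sketch below. The function and the exact decision are illustrative assumptions, not MediaWiki's actual code; it only captures the idea that parameterless page views get the crawlable /wiki/ form while anything carrying extra parameters stays under /w/:

from urllib.parse import quote, urlencode

def page_url(title, **params):
    # Assumed rule: a bare page view gets the short, crawlable form;
    # any extra parameters force the /w/index.php form that
    # robots.txt excludes.
    if not params:
        return "http://en.wikipedia.org/wiki/" + quote(title, safe=":")
    return "http://en.wikipedia.org/w/index.php?" + urlencode({"title": title, **params})

print(page_url("Special:Recentchanges"))              # /wiki/ form
print(page_url("Special:Recentchanges", feed="rss"))  # /w/ form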