people biographies pulled in via RSS - Wikitech-l - lists.wikimedia.org


QuotationsBook.com Webmaster/Support

2 Jun 2005

10:07 p.m.

Hello

I'm constructing a large literary resource, and would like to query articles about authors automatically, and receive Wikipedia articles back as RSS/XML documents to present in my website, with all the relevant backlinks to wikipedia.

Does Wikipedia have customised syndication of its content?

Regards
Amit
quotationsbook.com


Rowan Collins

3 Jun

3:43 p.m.

On 02/06/05, QuotationsBook.com Webmaster/Support wrote:

Does Wikipedia have customised syndication of its content?

* Well, in terms of literal feeds, there are a few things listed at http://en.wikipedia.org/wiki/Wikipedia:XML_feeds
* There are also a handful of bugs open requesting new feeds - http://bugzilla.wikimedia.org/buglist.cgi?bug_id=471%2C472%2C943
* Article content can also be exported wrapped in a custom slice of XML using the [[Special:Export]] page - see http://meta.wikimedia.org/wiki/Help:Export_and_import
* And finally, of course, there are the database dumps for setting up your own mirror and mucking about with it, at http://download.wikimedia.org

Hope something here goes at least some way to answering your query. Happy Hacking.

--
Rowan Collins BSc [IMSoP]
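As a rough illustration of the [[Special:Export]] route mentioned above, the following Python sketch builds an export URL for one article title and pulls the title and wikitext out of an export document. The base URL, the helper names (`export_url`, `parse_export`) and the hand-written `SAMPLE` document are assumptions made for this example, not something taken from the thread. The export schema namespace differs between MediaWiki versions, so the parser deliberately ignores it.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Assumed base URL for the English Wikipedia export page.
EXPORT_BASE = "https://en.wikipedia.org/wiki/Special:Export/"


def export_url(title):
    """Build a Special:Export URL for a single article title."""
    # MediaWiki titles use underscores in place of spaces.
    return EXPORT_BASE + urllib.parse.quote(title.replace(" ", "_"))


def local_name(tag):
    """Drop the '{namespace}' prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def parse_export(xml_text):
    """Return a list of {'title', 'text'} dicts, one per <page> element."""
    pages = []
    for page in ET.fromstring(xml_text).iter():
        if local_name(page.tag) != "page":
            continue
        record = {"title": None, "text": None}
        for node in page.iter():
            name = local_name(node.tag)
            # Keep the first <title> and the first revision <text> found.
            if name in record and record[name] is None:
                record[name] = node.text
        pages.append(record)
    return pages


# Hand-written sample shaped like an export document (illustrative only).
SAMPLE = """<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.10/">
  <page>
    <title>Jane Austen</title>
    <revision>
      <text>'''Jane Austen''' was an English novelist.</text>
    </revision>
  </page>
</mediawiki>"""

if __name__ == "__main__":
    print(export_url("Jane Austen"))
    print(parse_export(SAMPLE))
```

The text returned this way is raw wikitext rather than rendered HTML, so a site presenting it would still need its own rendering step and the backlinks to the source article.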


Jerome Jamnicky

5 Jun

5:57 p.m.

QuotationsBook.com Webmaster/Support wrote:

Hello I'm constructing a large literary resource, and would like to query articles about authors automatically, and receive Wikipedia articles back as RSS/XML documents to present in my website, with all the relevant backlinks to wikipedia. Does Wikipedia have customised syndication of its content?

If you do some sort of [semi-]dynamic inclusion of Wikipedia content into your site, please adhere to the following:

1) Take measures to ensure that your site does not violate robots.txt. In case it is not apparent, this includes ensuring that bots which access your site do not indirectly violate Wikipedia's robots.txt when your site is acting as an intermediary.

2) Have your software clearly identify itself to Wikipedia, on each request, with a user-agent string. There are some broad user-agent blocks in place to combat the frequent abuse of the site by bots using certain software/libraries. It's acceptable to alter the user-agent of your software to get past these blocks, as long as you make sure that it is well-behaved.

3) Know what your software is doing. Monitor the behaviour of your site to ensure that it doesn't load Wikipedia's servers with rubbish such as that you can see at the following URL: http://noc.wikimedia.org/~jeronim/mysic.txt

I'd also suggest that you list your site on the relevant subpage of http://en.wikipedia.org/wiki/Wikipedia:Mirrors_and_forks and describe its interaction with Wikipedia. This will help to conserve the time of the volunteers who monitor sites which use Wikipedia content, and hopefully prevent any misunderstandings.
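Points 1) and 2) above could be sketched in Python roughly as follows: check a URL against robots.txt rules before requesting it, and attach an identifying user-agent string to every request. The user-agent value, the contact URL, the helper names and the `SAMPLE_ROBOTS` rules are all invented for this example; a real client would load the site's actual robots.txt rather than a hard-coded sample.

```python
import urllib.request
import urllib.robotparser

# Hypothetical identifying string: tool name, version and a contact point.
USER_AGENT = "QuotesSiteFetcher/0.1 (+https://example.org/bot-info)"

# Invented rules for the example; not Wikipedia's real robots.txt.
SAMPLE_ROBOTS = """\
User-agent: *
Disallow: /w/
"""


def load_rules(robots_text):
    """Parse robots.txt text that has already been fetched."""
    parser = urllib.robotparser.RobotFileParser()
    parser.parse(robots_text.splitlines())
    return parser


def build_request(url, rules):
    """Return a Request carrying our user-agent, or None if disallowed."""
    if not rules.can_fetch(USER_AGENT, url):
        return None
    return urllib.request.Request(url, headers={"User-Agent": USER_AGENT})


if __name__ == "__main__":
    rules = load_rules(SAMPLE_ROBOTS)
    allowed = build_request("https://en.wikipedia.org/wiki/Jane_Austen", rules)
    blocked = build_request(
        "https://en.wikipedia.org/w/index.php?title=Jane_Austen&action=edit",
        rules,
    )
    print(allowed.get_header("User-agent"))
    print(blocked)
```

Nothing here sends a request; the returned object would be passed to `urllib.request.urlopen`. For the intermediary case in point 1), the same check would have to run for requests triggered by third-party crawlers visiting the site, and logging each outgoing URL is one simple way to support the monitoring asked for in point 3).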

