Dirk Riehle wrote:
Hi,
looking through the API: http://en.wikipedia.org/w/api.php
I can't find any way to get at the actual page contents. Is this correct?
As someone else said before me, you can get unparsed wikitext through
prop=revisions&rvprop=content. If you want HTML without the sidebar and
all that, you can get it with
index.php?action=render&title=Foo
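
A minimal Python sketch of building both URLs, in case it helps. The
action=query, titles and format=json parameters, and the /w/index.php
path, are not spelled out above; they are assumptions about a typical
request against en.wikipedia.org. The helper names are made up for
illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API = "http://en.wikipedia.org/w/api.php"
INDEX = "http://en.wikipedia.org/w/index.php"  # assumed location of index.php


def wikitext_url(title):
    """URL asking the API for the unparsed wikitext of one page."""
    params = {
        "action": "query",      # assumption: revisions is a query module
        "prop": "revisions",
        "rvprop": "content",
        "titles": title,        # assumption: page selected by title
        "format": "json",       # assumption: JSON output wanted
    }
    return API + "?" + urlencode(params)


def rendered_html_url(title):
    """URL for the page HTML without the sidebar and other chrome."""
    return INDEX + "?" + urlencode({"action": "render", "title": title})


def fetch(url):
    """Download a URL as text; the User-Agent string is a placeholder."""
    request = Request(url, headers={"User-Agent": "example-script/0.1"})
    with urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling fetch(rendered_html_url("Foo")) should then return just the
article body as HTML, while fetch(wikitext_url("Foo")) returns a JSON
document with the wikitext inside it.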
Finally, and that's why I'm sending this email to this mailing list: How
does Powerset do this: Go to powerset.com, search for something you
might find in Wikipedia, and see how it provides an up-to-date
(click-through) copy of the Wikipedia page. My hunch is that they
use a database dump for search and then screen-scrape, or is there a
better explanation?
You can actually search Wikipedia through the API:
http://en.wikipedia.org/w/api.php?action=query&list=search&srsearch…
(gets a list of 5 pages containing 'Foo')
http://en.wikipedia.org/w/api.php?action=query&generator=search&gsr…
(gets the contents of those pages)
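
The two URLs above are cut off, so the following Python sketch only
guesses at the rest: srsearch/srlimit for the plain search,
gsrsearch/gsrlimit plus prop=revisions&rvprop=content for the generator
variant, and format=json for both. The function names and the sample
response shape used to pull out titles are illustrative assumptions.

```python
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"


def search_url(term, limit=5):
    """URL listing up to `limit` pages that contain `term`."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,    # assumed: the truncated parameter above
        "srlimit": limit,    # assumed: matches "a list of 5 pages"
        "format": "json",
    }
    return API + "?" + urlencode(params)


def search_with_content_url(term, limit=5):
    """URL that runs the search as a generator and returns page contents."""
    params = {
        "action": "query",
        "generator": "search",
        "gsrsearch": term,   # assumed: generator parameters take a 'g' prefix
        "gsrlimit": limit,
        "prop": "revisions",
        "rvprop": "content",
        "format": "json",
    }
    return API + "?" + urlencode(params)


def titles_from_search(response):
    """Extract page titles from a decoded list=search JSON response."""
    hits = response.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits]
```

The generator form saves a round trip: the search results are fed
straight into prop=revisions instead of being fetched one by one.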
You can also get search suggestions in the OpenSearch format with
http://en.wikipedia.org/w/api.php?action=opensearch&search=Te
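
A small Python sketch of reading the suggestions, assuming the response
is a JSON array whose first element echoes the search prefix and whose
second element is the list of suggested titles. The function names and
the sample body in the comment are made up for illustration.

```python
import json
from urllib.parse import urlencode

API = "http://en.wikipedia.org/w/api.php"


def opensearch_url(prefix):
    """URL asking for search suggestions that start with `prefix`."""
    return API + "?" + urlencode({"action": "opensearch", "search": prefix})


def suggestions(body):
    """Return the suggested titles from an OpenSearch-style JSON body.

    Assumes a body shaped like '["Te", ["Tea", "Tennis"]]' (made-up data).
    """
    data = json.loads(body)
    return data[1]
```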
Roan Kattouw (Catrope)