I am writing a Java program to extract the abstract of a Wikipedia page
given the title of the page. I have done some research and found
out that the abstract will be in rvsection=0.
So, for example, if I want the abstract of the 'Eiffel Tower' wiki page, I
query the API in the following way,
and parse the XML data we get back and take the wikitext in the tag <rev
xml:space="preserve">, which represents the abstract of the Wikipedia page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way I can remove the infobox data
and get only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
Looking forward to your help.
Thanks in advance
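One alternative, assuming the wiki has the TextExtracts extension installed (Wikipedia does): prop=extracts with the exintro and explaintext parameters returns only the lead section as plain text, so there is no infobox or template markup to strip at all. A minimal sketch of building that request in Java:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class AbstractUrl {
    // Builds a query against the TextExtracts extension (prop=extracts).
    // exintro limits the result to the lead section; explaintext strips
    // wiki markup, so no infobox data comes back.
    static String buildExtractUrl(String title) {
        String encoded = URLEncoder.encode(title, StandardCharsets.UTF_8);
        return "https://en.wikipedia.org/w/api.php"
             + "?action=query&prop=extracts&exintro&explaintext"
             + "&format=xml&titles=" + encoded;
    }

    public static void main(String[] args) {
        System.out.println(buildExtractUrl("Eiffel Tower"));
    }
}
```

Fetching that URL with any HTTP client returns the abstract inside an <extract> element instead of raw wikitext, which avoids the infobox problem entirely.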
I have an application that every five minutes or so comes up with a list of
50-500 individual Wikipedia documents that it wants to request via curl. It
wants to get them as quickly and as completely as possible, with minimal
missed documents or dropped connections.
1) What is the etiquette for how to do this politely?
2) What are the best practices for using curl to submit API requests?
* is it best to use a single curl command with a long list of URLs to get,
or a long list of individual curl commands?
* compressed or not?
* retry values?
* other parameters?
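On the single-command-versus-many question, one technique worth knowing (assuming the standard action=query API rather than raw page fetches) is that MediaWiki accepts up to 50 titles per request, joined with "|", so 500 pages need only 10 API calls. A sketch of the batching step:

```java
import java.util.ArrayList;
import java.util.List;

public class BatchTitles {
    // MediaWiki's query API accepts up to 50 titles per request, joined
    // with "|"; each returned string is one value for the titles= parameter.
    static List<String> batchTitles(List<String> titles, int batchSize) {
        List<String> batches = new ArrayList<>();
        for (int i = 0; i < titles.size(); i += batchSize) {
            int end = Math.min(i + batchSize, titles.size());
            batches.add(String.join("|", titles.subList(i, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        System.out.println(batchTitles(List.of("A", "B", "C", "D", "E"), 2));
        // → [A|B, C|D, E]
    }
}
```

On etiquette, the API documentation asks clients to make requests serially rather than in parallel, send a descriptive User-Agent, accept gzip (Accept-Encoding: gzip), and pass the maxlag parameter so the servers can shed load politely when replication lags.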
Has there been any movement toward API account creation?  I
maintain several projects that use the API to embed a MediaWiki-type
interface into a CMS. As it stands now, I need to tell my users to go
to the MediaWiki interface to create an account -- a discombobulating
experience for many.
I want to add a widget to a web page that will allow users to enter search
terms and search Wikimedia for images that match the terms. I have
implemented a similar widget for Flickr, using their API, but am having
trouble doing the same with Wikimedia.
Basically, I would like to replicate the functionality of the
commons.wikimedia.org search page. Ideally I would like to be able to get a
Category listing (ex. http://commons.wikimedia.org/wiki/Chartres_Cathedral)
or true search results (ex.
but at this point I would be happy with either.
I've tried using the allimages list, but that is not adequate. Is there any
other way to search images using the API?
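Two API queries may get closer than allimages, assuming the standard Commons action API: list=search restricted to the File namespace (srnamespace=6) behaves roughly like the commons.wikimedia.org search page for images, and list=categorymembers gives a category listing. A sketch of both request URLs:

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class CommonsSearch {
    private static final String API = "https://commons.wikimedia.org/w/api.php";

    // Full-text search limited to the File: namespace (srnamespace=6),
    // which is approximately what the Commons search page does for images.
    static String imageSearchUrl(String terms) {
        return API + "?action=query&list=search&format=xml"
                   + "&srnamespace=6&srsearch="
                   + URLEncoder.encode(terms, StandardCharsets.UTF_8);
    }

    // Lists the members of a category, e.g. "Category:Chartres Cathedral".
    static String categoryListingUrl(String category) {
        return API + "?action=query&list=categorymembers&format=xml"
                   + "&cmtitle="
                   + URLEncoder.encode(category, StandardCharsets.UTF_8);
    }
}
```

Each hit in the search results is a File: page title; a follow-up query with prop=imageinfo can then resolve the actual image URL and thumbnail for the widget.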
I have also been looking at Freebase and DBPedia. These seem like they
might do what I want, but RDF is completely new to me and I'm still trying
to figure out the basics of it. If anyone can point me in the right
direction for either of those resources, I would appreciate it.
I am using the following perl modules to extract data from Wikipedia and
Wikitravel respectively -
From both these APIs, and also by looking at the MediaWiki APIs, I seem to
get the entire chunk of text in the Web Service response. To extract
different sections of the Wiki entry, I have to rely on pattern matching
and regular expressions.
Is there a better way to achieve this? Is there some sample code in any
language (preferably Perl) that anyone can share, or some tool that does
this out of the box?
Any help would be appreciated.
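One way to avoid the regex matching, assuming the standard action API is reachable: action=parse with prop=sections returns the page's section list (titles, numbers, and index values) as structured data, and the index from that response can then be fed to rvsection to fetch just one section's wikitext. A sketch of the two request URLs (in Java rather than Perl, but the parameters carry over):

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

public class SectionUrls {
    private static final String API = "https://en.wikipedia.org/w/api.php";

    // Step 1: list the page's sections as structured XML (each <s> element
    // carries the section's title and its numeric index).
    static String sectionListUrl(String page) {
        return API + "?action=parse&prop=sections&format=xml&page="
                   + URLEncoder.encode(page, StandardCharsets.UTF_8);
    }

    // Step 2: fetch only one section's wikitext, using an index from step 1,
    // instead of pattern-matching the whole page.
    static String sectionContentUrl(String page, int sectionIndex) {
        return API + "?action=query&prop=revisions&rvprop=content"
                   + "&format=xml&rvsection=" + sectionIndex
                   + "&titles="
                   + URLEncoder.encode(page, StandardCharsets.UTF_8);
    }
}
```

The two-step lookup replaces the regular expressions with an XML parse of a small, well-defined response, which tends to be much less brittle across page layouts.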