MaxSem on IRC gave a solution that may help you.
Using the following call, you can get section titles, numbers and
offsets from the beginning of the page:
https://en.wikipedia.org/w/api.php?action=parse&page=Pittsburgh&pro…
Using the following call, you can get a section's text by its number:
https://en.wikipedia.org/w/api.php?action=parse&page=Pittsburgh&pro…
You can tweak your calls using the API sandbox:
https://en.wikipedia.org/wiki/Special:ApiSandbox
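
In case it helps, here is a rough sketch of the two calls in Python (not Perl, sorry). The URLs above got cut off in this mail, so the parameters below are an assumption on my part: I am using prop=sections for the section list and section=N with prop=wikitext for a single section, which are standard action=parse options but may not be exactly what MaxSem suggested. The helper names and the User-Agent string are made up for illustration.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_url(params):
    """Return the API URL for the given query parameters, asking for JSON."""
    query = dict(params, format="json")
    return API_URL + "?" + urllib.parse.urlencode(query)


def sections_url(page):
    # Assumption: prop=sections asks action=parse for the section list only.
    return build_url({"action": "parse", "page": page, "prop": "sections"})


def section_text_url(page, index):
    # Assumption: section=<index> limits the parse to one section and
    # prop=wikitext returns that section's wiki source.
    return build_url(
        {"action": "parse", "page": page, "prop": "wikitext", "section": index}
    )


def extract_sections(response):
    """Return (index, number, title, byteoffset) tuples from a decoded
    prop=sections response."""
    return [
        (s["index"], s["number"], s["line"], s.get("byteoffset"))
        for s in response["parse"]["sections"]
    ]


def fetch(url):
    """Download and decode one API response (needs network access)."""
    # Placeholder User-Agent; replace it with something that identifies you.
    request = urllib.request.Request(
        url, headers={"User-Agent": "section-example/0.1 (placeholder)"}
    )
    with urllib.request.urlopen(request) as reply:
        return json.load(reply)
```

Passing the result of fetch(sections_url("Pittsburgh")) to extract_sections should give you one tuple per section, and the index from a tuple can then go into section_text_url to request only that section.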
--
Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי
http://aharoni.wordpress.com
“We're living in pieces,
I want to live in peace.” – T. Moore
2012/3/3 Ashish Mukherjee <ashish.mukherjee(a)gmail.com>:
Hi,
I am using the following Perl modules to extract data from Wikipedia and
Wikitravel, respectively:
- WWW::Wikipedia
- MediaWiki::API
With both of these modules, and from what I can see of the MediaWiki API
itself, the web service response seems to contain the entire page text.
To extract individual sections of a wiki entry, I have to rely on pattern
matching and regular expressions.
Is there a better way to achieve this? Is there some sample code in any
language (preferably Perl) that anyone can share, or is there a tool
that does this out of the box?
Any help would be appreciated.
Regards,
Ashish
_______________________________________________
Mediawiki-api mailing list
Mediawiki-api(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api