Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given its title. I have done some research and found out that the abstract
will be in rvsection=0.
So, for example, if I want the abstract of the "Eiffel Tower" page, I query
the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
I then parse the returned XML and take the wikitext in the <rev
xml:space="preserve"> tag, which represents the abstract of the page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way to remove the infobox data and get
only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
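
To make the two options concrete, here is a rough Java sketch, not a definitive solution. The first method drops every {{...}} template from the section-0 wikitext by tracking brace depth. Infoboxes are templates, so they go away, but so does any inline template that carries visible text. The second method builds a query URL that assumes the TextExtracts extension (prop=extracts with exintro and explaintext) is enabled on the wiki, which would return the intro as plain text directly. The class and method names are made up for illustration.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative, not part of any library.
final class LeadSectionSketch {

    // Removes every {{...}} template, including nested ones, by tracking
    // brace depth. Text is copied to the output only while depth is zero.
    // Limitation: {{{parameter}}} syntax and <nowiki> blocks are not handled.
    static String stripTemplates(String wikitext) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        int i = 0;
        while (i < wikitext.length()) {
            if (wikitext.startsWith("{{", i)) {
                depth++;
                i += 2;
            } else if (depth > 0 && wikitext.startsWith("}}", i)) {
                depth--;
                i += 2;
            } else {
                if (depth == 0) {
                    out.append(wikitext.charAt(i));
                }
                i++;
            }
        }
        return out.toString().trim();
    }

    // Assumption: the wiki has the TextExtracts extension, so that
    // prop=extracts with exintro and explaintext returns the plain-text intro.
    static String extractsUrl(String title) {
        return "https://en.wikipedia.org/w/api.php?action=query&prop=extracts"
                + "&exintro=1&explaintext=1&format=json&titles="
                + URLEncoder.encode(title, StandardCharsets.UTF_8);
    }
}
```

With the first approach, what remains still contains wiki markup such as '''bold''' and [[links]], so the second approach is probably the simpler one if plain text is all that is needed.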
Looking forward to your help.
Thanks in advance,
Aditya Uppu
Hey,
This seems to be the homepage for the "REST API":
https://www.mediawiki.org/wiki/Wikimedia_REST_API
but I don't see any documentation there for API parameters.
For example, someone gave me this example API call:
https://en.wikisource.org/api/rest_v1/page/html/Little_Essays_of_Love_and_V…
But I don't see information in the first link above that you should append
"page" and "html" to the wikisource API endpoint.
Is there documentation for this API?
Also, is this API an attempt to replace the old one or something? How does
it differ from the MediaWiki Action API and the MediaWiki REST API?
For example, do they fetch different kinds of information or something?
Also, would the "parse" module (action=parse) of the Action API be a good way to get
plaintext from a Wikisource book?
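
For comparison, here is a small Java sketch of how the two kinds of URL could be put together. The REST path follows the /api/rest_v1/page/html/ pattern from the example call above. The Action API URL assumes action=parse with prop=text and format=json, which returns the rendered HTML wrapped in JSON. The class and method names are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class for building the two URL styles side by side.
final class WikiUrlSketch {

    // MediaWiki titles use underscores for spaces; everything else that is
    // not URL-safe (including "/" in subpage titles) is percent-encoded.
    static String encodeTitle(String title) {
        return URLEncoder.encode(title.replace(' ', '_'), StandardCharsets.UTF_8);
    }

    // REST style: the title is part of the path.
    static String restHtmlUrl(String host, String title) {
        return "https://" + host + "/api/rest_v1/page/html/" + encodeTitle(title);
    }

    // Action API style: everything is a query parameter on api.php.
    static String actionParseUrl(String host, String title) {
        return "https://" + host + "/w/api.php?action=parse&prop=text&format=json&page="
                + encodeTitle(title);
    }
}
```

For instance, restHtmlUrl("en.wikisource.org", "Little Essays of Love and Virtue") gives the title page's HTML URL. Either way the result is HTML rather than plain text, so the tags still have to be removed afterwards.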
Thank you,
Julius
Hey,
Does anyone think it would be possible, or has it been done before, to take
some kind of downloaded dictionary data from a corpus and try to
automatically upload it to Wiktionary?
For example, if you had a very large dictionary of the words in a language
and their part of speech, could you check if the word was already present
in Wiktionary and if so skip it, and otherwise create entries for all
remaining words, with their part of speech?
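
The skip-or-create step could look something like the Java sketch below. It is only an outline under assumptions: the existence check is passed in as a predicate (in practice it might be backed by an action=query call with the titles parameter, where pages that do not exist are flagged as missing), and the batch helper assumes titles are checked in groups, with 50 per request being the usual limit for ordinary accounts. All names are hypothetical and no upload is performed.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

// Hypothetical sketch of the filtering step only; it does not edit any wiki.
final class WiktionaryImportSketch {

    // Keeps the word -> part-of-speech pairs for which no page exists yet.
    // existsOnWiki is supplied by the caller, e.g. from an API lookup.
    static Map<String, String> missingEntries(Map<String, String> dictionary,
                                              Predicate<String> existsOnWiki) {
        Map<String, String> missing = new LinkedHashMap<>();
        for (Map.Entry<String, String> entry : dictionary.entrySet()) {
            if (!existsOnWiki.test(entry.getKey())) {
                missing.put(entry.getKey(), entry.getValue());
            }
        }
        return missing;
    }

    // Splits the word list into groups so that each existence query stays
    // within the per-request title limit.
    static List<List<String>> batches(List<String> words, int size) {
        List<List<String>> out = new ArrayList<>();
        for (int i = 0; i < words.size(); i += size) {
            out.add(new ArrayList<>(words.subList(i, Math.min(i + size, words.size()))));
        }
        return out;
    }
}
```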
Thank you,
Julius
Hey,
It seems relatively easy to export any section of a Wikisource book with
the "download" button, but there is no way to export just the preface /
title page. If you press "download" on the title page, the section name is
just the book name, so you download the entire book instead of just the
preface.
For example: https://en.wikisource.org/wiki/Little_Essays_of_Love_and_Virtue
Is this intentional? Is there any good way to export the plaintext of just
the preface, via some command / functionality / API?
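
One possible workaround, sketched in Java under assumptions: fetch the HTML of the title page alone (for example through the /api/rest_v1/page/html/ endpoint mentioned in the earlier message, which I would expect to return only that page and not its subpages) and reduce it to plain text. The tag stripping below is a crude regex pass, not a real HTML parser, and the class name, method names and User-Agent string are made up.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

// Hypothetical sketch: download one page's HTML and flatten it to text.
final class PlainTextSketch {

    // Fetches the body of the given URL as a string (Java 11+ HttpClient).
    static String fetch(String url) throws IOException, InterruptedException {
        HttpRequest request = HttpRequest.newBuilder(URI.create(url))
                .header("User-Agent", "preface-export-sketch/0.1 (example only)")
                .GET()
                .build();
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    // Crude HTML-to-text conversion: drops script/style blocks, turns the
    // ends of block elements into newlines, removes the remaining tags and
    // decodes a handful of common entities.
    static String htmlToPlainText(String html) {
        String text = html
                .replaceAll("(?is)<(script|style)[^>]*>.*?</\\1>", "")
                .replaceAll("(?i)<br\\s*/?>|</(p|div|h[1-6]|li)>", "\n")
                .replaceAll("<[^>]+>", "")
                .replace("&nbsp;", " ")
                .replace("&lt;", "<")
                .replace("&gt;", ">")
                .replace("&quot;", "\"")
                .replace("&amp;", "&");
        // Collapse runs of spaces and blank lines.
        return text.replaceAll("[ \\t]+", " ")
                .replaceAll(" ?\\n\\s*", "\n")
                .trim();
    }
}
```

The title page usually also carries a header and a table of contents, so the output would still need trimming down to the preface itself.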
Thank you,
Julius