Hello,
I am writing a Java program that extracts the abstract of a Wikipedia page,
given the page's title. From some research, I found that the abstract
will be in rvsection=0.
For example, to get the abstract of the "Eiffel Tower" page, I query the
API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
I then parse the XML that comes back and take the wikitext inside the <rev
xml:space="preserve"> tag, which represents the abstract of the Wikipedia page.
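A minimal sketch of the parsing step described above, for illustration only. The class and method names (RevExtractor, extractRevText) are hypothetical, and the sample response in main is a hand-written assumption about the XML shape, not output copied from the API.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

// Hypothetical helper; names are illustrative only.
final class RevExtractor {

    /** Returns the text of the first <rev> element, or null if there is none. */
    static String extractRevText(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList revs = doc.getElementsByTagName("rev");
            return revs.getLength() == 0 ? null : revs.item(0).getTextContent();
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse API response", e);
        }
    }

    public static void main(String[] args) {
        // Assumed response shape; a real response carries more attributes.
        String sample = "<api><query><pages><page title=\"Eiffel Tower\">"
                + "<revisions><rev xml:space=\"preserve\">"
                + "{{Infobox building}}\nThe '''Eiffel Tower''' is a tower."
                + "</rev></revisions></page></pages></query></api>";
        System.out.println(extractRevText(sample));
    }
}
```

The returned string is the raw wikitext of the section, so any templates at the top of the page are still included.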
However, this wikitext also contains the infobox data, which I do not need.
I would like to know whether there is any way to remove the infobox data
and get only the wikitext for the page's abstract, or whether there is an
alternative method that returns the abstract of the page directly.
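One possible approach to the infobox removal asked about above, sketched under assumptions: since an infobox is written as a {{...}} template, the wikitext can be scanned while tracking brace depth, dropping everything inside templates. The class and method names are hypothetical. Note that this removes every template, including inline ones within the lead text, and it does not handle other markup such as links or references.

```java
// Hypothetical helper; names are illustrative only.
final class WikitextTemplates {

    /** Removes every {{...}} template, including nested ones, from the wikitext. */
    static String stripTemplates(String wikitext) {
        StringBuilder out = new StringBuilder();
        int depth = 0; // number of currently open "{{"
        int i = 0;
        while (i < wikitext.length()) {
            if (wikitext.startsWith("{{", i)) {
                depth++;
                i += 2;
            } else if (depth > 0 && wikitext.startsWith("}}", i)) {
                depth--;
                i += 2;
            } else {
                // Keep characters only when outside all templates.
                if (depth == 0) {
                    out.append(wikitext.charAt(i));
                }
                i++;
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        String sample = "{{Infobox building\n| name = Eiffel Tower\n"
                + "| height = {{convert|324|m}}\n}}\n"
                + "The '''Eiffel Tower''' is a tower in Paris.";
        System.out.println(stripTemplates(sample));
    }
}
```

Counting depth rather than searching for the first "}}" matters because infobox fields often contain nested templates, so a simple non-greedy match would cut the infobox off too early.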
Looking forward to your help.
Thanks in Advance
Aditya Uppu
I don't know
It's blocked for me
On Sunday, 29 Dec 2019 at 14:01, <mediawiki-api-request@lists.wikimedia.org> wrote:
> Today's Topics:
>
> 1. API search issue (Alessandro Vallin)
>
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 29 Dec 2019 11:42:50 +0100
> From: Alessandro Vallin <alessandro.vallin(a)gmail.com>
> To: mediawiki-api(a)lists.wikimedia.org
> Subject: [Mediawiki-api] API search issue
> Message-ID:
> <CAO6zDE6nJkUJShL1S=
> M8Jg2qT-KZUjK3uQ3pVt-5Z-YFVatxxw(a)mail.gmail.com>
> Content-Type: text/plain; charset="utf-8"
>
> Good morning
>
> I am using the following API search, which used to return title, content
> and link of a Wikipedia entry:
>
>
> https://it.wikipedia.org/w/api.php?action=opensearch&search=alessandro%20le…
>
> Just recently I noticed that it is always returning an empty content part
> ([""]):
>
> ["alessandro leogrande",["Alessandro Leogrande"],[""],["https://it.wikipedia.org/wiki/Alessandro_Leogrande"]]
>
> Can you please give me any insight?
> Thanks,
> Alessandro
>