Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given the title of the page. I have done some research and found
out that the abstract will be in rvsection=0.
So, for example, if I want the abstract of the "Eiffel Tower" wiki page, then I am
querying the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
and then parse the XML data we get back, taking the wikitext inside the <rev
xml:space="preserve"> tag, which represents the abstract of the Wikipedia page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way I can remove the infobox data
and get only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
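For illustration, a minimal sketch (Python here just for brevity, using the TextExtracts extension's prop=extracts, which should be available on English Wikipedia) of fetching only the plain-text lead section, without the infobox wikitext:

import requests

# Fetch only the lead section ("abstract") of a page as plain text via the
# TextExtracts extension, so the infobox wikitext never appears in the result.
params = {
    "action": "query",
    "prop": "extracts",
    "exintro": 1,         # only the content before the first section heading
    "explaintext": 1,     # plain text instead of HTML
    "titles": "Eiffel Tower",
    "format": "json",
    "formatversion": "2",
}
resp = requests.get("https://en.wikipedia.org/w/api.php", params=params)
page = resp.json()["query"]["pages"][0]
print(page["extract"])

The same parameters should work from Java with any HTTP client; the response then contains the abstract directly instead of raw wikitext.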
Looking forward to your help.
Thanks in Advance
Aditya Uppu
I'm looking into the Wikiloop Battlefield data
http://battlefield.wikiloop.org/feed/mix and trying to query the data via
the API using a revision id.
My attempts appear to return with "badrevids".
Am I using the API incorrectly?
Here is my sample Python code:
import requests
S = requests.Session()
URL = "https://www.mediawiki.org/w/api.php"
PARAMS = {
    "action": "query",
    "prop": "revisions",
    "revids": 961090023,
    "rvprop": "ids|timestamp|user|comment|content",
    "rvslots": "main",
    "formatversion": "2",
    "format": "json"
}

R = S.get(url=URL, params=PARAMS)
DATA = R.json()
PAGES = DATA["query"]
for page in PAGES:
    print(page)
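For reference, a revision id only resolves on the wiki where the edit was made, so if the ids in the Wikiloop feed come from English Wikipedia, the same query would need to go to en.wikipedia.org rather than www.mediawiki.org. A minimal variation of the code above, under that assumption:

# Assumption: the revision id belongs to English Wikipedia, not mediawiki.org.
URL = "https://en.wikipedia.org/w/api.php"

R = S.get(url=URL, params=PARAMS)
DATA = R.json()

# With formatversion=2, found revisions come back under query.pages;
# unknown ids are listed under query.badrevids instead.
for page in DATA["query"].get("pages", []):
    for rev in page.get("revisions", []):
        print(rev["revid"], rev["timestamp"], rev["user"], rev["comment"])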
This message is a notice about the forthcoming removal of a response field in
the REST API’s /page/summary endpoint. [1] This message was posted to
wikitech-l and is being cross-posted on mediawiki-api-announce.
The endpoint is used in the Page Previews (“hovercards”) functionality on
the classic web (desktop) and Android & iOS experiences for Wikipedia, in
addition to numerous external experiences. We don't anticipate impacts on
the mainline Wikipedia experiences from this forthcoming field removal, as
the endpoint will still be operational.
The REST API's /page/summary endpoint provides an api_urls field in its
response with links to several other REST API endpoints supported by the
Page Content Service.
Three of the api_urls subfields refer to experimental endpoints that have
been removed from the REST API altogether in favor of newer API endpoints.
1. media (/page/media)
2. references (/page/references)
3. metadata (/page/metadata)
api_urls also contains subfields referring to stable endpoints that
continue to exist in the REST API:
1. summary (which is self referential)
2. edit_html (/page/html - the Parsoid HTML)
3. talk_html (/page/html/Talk:<title> - the Parsoid HTML for the
corresponding Talk page)
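For reference, a quick sketch (Python, against English Wikipedia's summary endpoint) that lists which of these api_urls subfields a given response currently carries:

import requests

resp = requests.get(
    "https://en.wikipedia.org/api/rest_v1/page/summary/Eiffel_Tower").json()
# api_urls is the field scheduled for removal; print its subfields, if any.
for name, url in resp.get("api_urls", {}).items():
    print(name, "->", url)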
Wikimedia’s Product Infrastructure team intends to remove the api_urls
field of the /page/summary response.
A cursory review at https://codesearch.wmflabs.org/ and Wikimedia Git
mirrors suggests api_urls isn’t in use in consuming code.
Review of web logs suggests traffic for the following endpoints:
- The media endpoint is at about 5% of its original traffic before its
decommission and it appears to be from old Wikipedia for Android clients.
This is expected.
- The references endpoint’s Wikipedia for Android traffic does not seem
present and its other traffic appears to be from non-user application
software based on User-Agent header components.
- The metadata endpoint’s traffic seems to have all but stopped.
This change is being announced in advance because the endpoint is
advertised as stable. [2]
Please update your clients if you rely on the presence of the api_urls
field. If this change poses a problem for your clients, please do let us
know as soon as possible at the tracking task:
https://phabricator.wikimedia.org/T247991
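For clients that currently read api_urls, a minimal migration sketch (Python, assuming English Wikipedia and the stable /page/summary and /page/html routes listed above): build the related REST URLs from the title yourself instead of taking them from the response.

import requests
import urllib.parse

REST_BASE = "https://en.wikipedia.org/api/rest_v1"
title = "Eiffel Tower"
encoded = urllib.parse.quote(title.replace(" ", "_"), safe="")

# Fetch the summary as before; just stop reading its api_urls field.
summary = requests.get(f"{REST_BASE}/page/summary/{encoded}").json()
print(summary["extract"])

# Equivalents of the edit_html / talk_html entries that api_urls carried:
edit_html_url = f"{REST_BASE}/page/html/{encoded}"
talk_html_url = f"{REST_BASE}/page/html/" + urllib.parse.quote(
    "Talk:" + title.replace(" ", "_"), safe="")
print(edit_html_url, talk_html_url)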
We plan to remove the api_urls field described here on or after 14 July
2020.
Thank you.
Adam Baso
Director of Engineering
Wikimedia Foundation
[1]
https://en.wikipedia.org/api/rest_v1/#/Page%20content/get_page_summary__tit…
[2] Refer to
https://www.mediawiki.org/wiki/API_versioning#End_point_stability and
https://www.mediawiki.org/wiki/Wikimedia_Product/Wikimedia_Product_Infrastr…
for more information on stability designations.
_______________________________________________
Mediawiki-api-announce mailing list
Mediawiki-api-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
Hello,
I'm trying to add Meta as a wiki in my tool, WPCleaner.
Meta differs from the usual language wikis in that a page can have
translations. Such translations are handled by Extension:Translate, which
prevents updating the translated pages through API:Edit (you get the error
tpt-target-page when trying to update such a page).
How can I know before using API:Edit that it will not work?
If possible, I'd like to detect this when retrieving basic information about
the page (like with API:Query:Info).
My need is to skip such pages (avoid getting their content and working on
them) for the moment.
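To make the kind of pre-check I'm after more concrete, here is a rough sketch (Python; whether the Translate restriction actually surfaces through prop=info's intestactions is an assumption I have not verified, and the title is just a hypothetical translated page):

import requests

params = {
    "action": "query",
    "prop": "info",
    "intestactions": "edit",        # ask whether the current user may edit
    "titles": "Main Page/fr",       # hypothetical translated page on Meta
    "format": "json",
    "formatversion": "2",
}
resp = requests.get("https://meta.wikimedia.org/w/api.php", params=params)
for page in resp.json()["query"]["pages"]:
    # If the Translate restriction is reported here, "edit" would be false
    # for translated pages and WPCleaner could skip them up front.
    print(page["title"], page.get("actions", {}))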
Nico
Hi,
I'm having some trouble when I try to fetch data for swimmers.
When I make an allpages request (
https://wiki.swimrankings.net/api.php?action=query&list=allpages&format=json)
I only get 4 unimportant pages.
There are no pages like rankingDetail that could contain the data, and I can't
even fetch those pages to see if data can be extracted from them.
If you look at, for example, a parse request for Main_Page:
https://wiki.swimrankings.net/api.php?action=parse&page=Main_Page&prop=text…
you can see that it returns HTML content.
There are no other pages except the 4 pages above, so I cannot fetch any
swimmer's data.
Can I somehow get data for individual swimmers... ?
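One thing worth checking: list=allpages only enumerates a single namespace per request (the main namespace by default), so content could still exist in other namespaces. A minimal sketch (against the same api.php) that walks all namespaces and lists their pages:

import requests

API = "https://wiki.swimrankings.net/api.php"

# Enumerate the wiki's namespaces, then list pages in each of them,
# since list=allpages covers only one namespace at a time.
ns_data = requests.get(API, params={
    "action": "query",
    "meta": "siteinfo",
    "siprop": "namespaces",
    "format": "json",
}).json()

for ns_id, ns in ns_data["query"]["namespaces"].items():
    if int(ns_id) < 0:
        continue  # skip virtual namespaces (Special:, Media:)
    pages = requests.get(API, params={
        "action": "query",
        "list": "allpages",
        "apnamespace": ns_id,
        "aplimit": "max",
        "format": "json",
    }).json()
    for p in pages["query"]["allpages"]:
        print(ns.get("*") or "(main)", "|", p["title"])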
Vladimir