Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given the page's title. I have done some research and found
out that the abstract will be in rvsection=0.
So, for example, if I want the abstract of the 'Eiffel Tower' wiki page, I
query the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
I then parse the XML data that comes back and take the wikitext in the tag <rev
xml:space="preserve">, which represents the abstract of the Wikipedia page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way to remove the infobox data
and get only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
Looking forward to your help.
Thanks in Advance
Aditya Uppu
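One possible way to drop the infobox is to strip every {{...}} template from the section-0 wikitext by tracking brace depth. The sketch below is in Python rather than Java and is only an illustration, not an official method: the helper names (build_params, extract_rev_text, strip_templates), the rvprop and format values, and the sample XML are assumptions. It also removes inline templates from the lead text, not just the infobox.

```python
import xml.etree.ElementTree as ET


def build_params(title):
    """Query parameters for section 0 of a page (rvprop/format are assumed)."""
    return {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content",
        "rvsection": "0",
        "format": "xml",
    }


def extract_rev_text(xml_text):
    """Return the text of the first <rev> element, or '' if there is none."""
    rev = ET.fromstring(xml_text).find(".//rev")
    return rev.text if rev is not None and rev.text else ""


def strip_templates(wikitext):
    """Remove every {{...}} template, including nested ones, by brace depth."""
    out = []
    depth = 0
    i = 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            if depth == 0:
                out.append(wikitext[i])
            i += 1
    return "".join(out).strip()


# Hypothetical response body, shaped like the XML described in the message.
sample_xml = (
    '<api><query><pages><page title="Eiffel Tower"><revisions>'
    '<rev xml:space="preserve">{{Infobox building\n'
    "| height = {{convert|324|m}}\n}}\n"
    "The '''Eiffel Tower''' is a tower in Paris.</rev>"
    "</revisions></page></pages></query></api>"
)
abstract = strip_templates(extract_rev_text(sample_xml))
print(abstract)
```

The brace counter keeps nested templates such as {{convert|...}} inside the infobox from ending the skip early.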
When list=allusers is used with auactiveusers, a property 'recenteditcount'
is returned in the result. In bug 67301[1] it was pointed out that this
property includes various other logged actions, and so should really be
named something like "recentactions".
Gerrit change 130093,[2] merged today, adds the "recentactions" result
property. "recenteditcount" is also returned for backwards compatibility,
but will be removed at some point during the MediaWiki 1.25 development
cycle.
Any clients using this property should be updated to use the new property
name. The new property will be available on WMF wikis with 1.24wmf12, see
https://www.mediawiki.org/wiki/MediaWiki_1.24/Roadmap for the schedule.
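A client that has to work both before and after the change could prefer the new property and fall back to the old one. This is a minimal sketch, assuming each list=allusers entry has already been decoded into a Python dict; the helper name recent_actions and the sample entries are hypothetical.

```python
def recent_actions(user):
    """Return the recent-action count from one list=allusers entry.

    Prefers the new 'recentactions' property and falls back to the
    deprecated 'recenteditcount' on wikis that do not return it yet.
    Returns None when neither property is present.
    """
    if "recentactions" in user:
        return user["recentactions"]
    return user.get("recenteditcount")


# Hypothetical entries, shaped like decoded API results.
new_style = {"name": "Example", "recentactions": 7, "recenteditcount": 7}
old_style = {"name": "Example", "recenteditcount": 3}
print(recent_actions(new_style), recent_actions(old_style))
```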
[1]: https://bugzilla.wikimedia.org/show_bug.cgi?id=67301
[2]: https://gerrit.wikimedia.org/r/#/c/130093/
--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
_______________________________________________
Mediawiki-api-announce mailing list
Mediawiki-api-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
In preparation for multi-content revisions (MCR), we've made[1] several
changes to action=compare. These changes should be deployed to Wikimedia
wikis with 1.32.0-wmf.19 or later. The changes should also be available on
the Beta Cluster[2] soon for testing.
*== Supplying content using templated parameters ==*
For MCR, when specifying content (as with the `fromtext` and `totext`
parameters) we need the ability to specify content for each "slot" in the
page. The way this works for action=compare is that (1) the base revision
is determined using the parameters that identify the page and/or revision
(`fromtitle`/`totitle`, `fromrev`/`torev`, and so on), then (2) the new
`fromslots`/`toslots` parameter specifies which slots are being changed,
and then (3) new parameters for each value of `fromslots`/`toslots` specify
the content for each of those slots.
In the API help, these new parameters for each value of
`fromslots`/`toslots` are described as "templated parameters" and have a
placeholder in their names. Where the help describes "totext-{slot}", it
means that if you supply "toslots=foo|bar" then there will be
corresponding parameters "totext-foo" and "totext-bar" to supply the text
for those two slots.
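The expansion of a templated parameter can be sketched as a small helper that builds the request parameters from a slot-to-text mapping. The function name templated_params is hypothetical, and "foo" and "bar" are the placeholder slot names used above, not real slots.

```python
def templated_params(prefix, slot_texts):
    """Build '<prefix>slots' plus one '<prefix>text-<slot>' entry per slot.

    prefix is 'from' or 'to'; slot_texts maps slot names to their content.
    """
    params = {prefix + "slots": "|".join(slot_texts)}
    for slot, text in slot_texts.items():
        params[f"{prefix}text-{slot}"] = text
    return params


params = templated_params("to", {"foo": "text for foo", "bar": "text for bar"})
for key, value in params.items():
    print(key, "=", value)
```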
In Special:ApiSandbox, input fields for "totext-foo" and "totext-bar" will
appear when you enter those values for "toslots".
In the future templated parameters will be introduced for action=edit and
action=parse as well, and other modules as the need arises.
*== Deprecations and changes in action=compare ==*
The following parameters are deprecated, with replacements as indicated.
- `fromtext` is replaced with `fromtext-main` with `fromslots=main`.
- `fromcontentmodel` is replaced with `fromcontentmodel-main` with
`fromslots=main`.
- `fromcontentformat` is replaced with `fromcontentformat-main` with
`fromslots=main`.
- `totext` is replaced with `totext-main` with `toslots=main`.
- `tocontentmodel` is replaced with `tocontentmodel-main` with
`toslots=main`.
- `tocontentformat` is replaced with `tocontentformat-main` with
`toslots=main`.
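Existing client code could apply the replacements listed above mechanically. The following is only a sketch: the helper name migrate_compare_params is hypothetical, and it assumes the incoming parameters use the old names and do not already set `fromslots` or `toslots`.

```python
DEPRECATED_SUFFIXES = ("text", "contentmodel", "contentformat")


def migrate_compare_params(params):
    """Return a copy of params with the deprecated names rewritten.

    For example 'totext' becomes 'totext-main' and 'toslots=main' is added.
    Assumes the input does not already contain 'fromslots' or 'toslots'.
    """
    migrated = dict(params)
    for side in ("from", "to"):
        for suffix in DEPRECATED_SUFFIXES:
            old_key = side + suffix
            if old_key in migrated:
                migrated[old_key + "-main"] = migrated.pop(old_key)
                migrated[side + "slots"] = "main"
    return migrated


old = {"action": "compare", "fromrev": "123",
       "totext": "new text", "tocontentmodel": "wikitext"}
print(migrate_compare_params(old))
```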
The `fromsection` and `tosection` parameters are also deprecated with no
direct replacement. The intended use case for these parameters was to
simulate a diff of a section edit, by supplying the edited section's text
as `totext` and supplying `fromsection` to extract just the section being
edited from the current revision. This use case is now supported by
specifying `totext-main` as the edited section's text and supplying
`tosection-main` to identify the section being edited, which will be
combined into the existing content as for a section edit. This will result
in a diff more closely matching that returned for a section edit from the
web UI with respect to line numbers and context lines.
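A request for the section-edit use case might be assembled as below. This is a sketch under assumptions: the helper name section_edit_diff_params is hypothetical, and using `fromtitle` and `totitle` with the same title to pick the current revision as the base on both sides is one reading of the description above, not a confirmed recipe.

```python
def section_edit_diff_params(title, section, new_section_text):
    """Parameters for diffing a simulated section edit against the page.

    The 'to' side starts from the current revision of the title, and
    'tosection-main' says which section 'totext-main' replaces.
    """
    return {
        "action": "compare",
        "fromtitle": title,
        "totitle": title,
        "toslots": "main",
        "totext-main": new_section_text,
        "tosection-main": str(section),
    }


params = section_edit_diff_params("Sandbox", 2, "== History ==\nUpdated text.")
print(params)
```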
By default action=compare will return one HTML blob combining the diffs of
all slots, much as is shown in the web UI. The new `slots` parameter may be
used to get separate HTML blobs for each slot's diff and to limit which
slots' diffs are returned.
*== Other notes ==*
Note that the already-deprecated[3] diffing parameters to revision-related
modules, such as the rvdifftotext parameter to action=query&prop=revisions,
will not be updated for MCR. Code using these parameters should be updated
to use action=compare instead.
[1]: https://gerrit.wikimedia.org/r/c/mediawiki/core/+/448160
[2]: e.g. https://en.wikipedia.beta.wmflabs.org/w/api.php?modules=compare
[3]: https://lists.wikimedia.org/pipermail/mediawiki-api-announce/2017-June/0001…
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
Thanks, Brad. Yes, I was being lazy since the token is always the same.
Matthew
|| Matthew Cahn | Linux Administrator | Dept. of Molecular Biology / Research Computing | Princeton University | (609) 258-5404 | mcahn(a)princeton.edu ||
Oh, of course, POST, thanks. Now it works, after also removing the “r” (raw string) from the token since it already has an escaped backslash, and removing urllib.parse.urlencode from the parameters. Here’s the working version in case anyone would like to see it:
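#!/bin/env python
import requests

baseUrl = 'http://chlamyannotations-test2.princeton.edu/api.php'
responseFilename = '/molbio2/mcahn/temp/createPagesResponse.html'

# Request the tokens; the response is only printed for inspection.
params = {'action': 'query',
          'meta': 'tokens'}
r = requests.get(baseUrl, params=params)
print(r)
print(r.text)

# The token is hard-coded because it is always the same here.
# It is a plain string: '+\\' in Python source is a plus and one backslash.
params = {'action': 'edit',
          'title': 'TestPage3',
          'summary': 'Test summary',
          'text': 'article content',
          'token': '+\\'}

# The edit must be a POST; requests form-encodes the dict itself.
r = requests.post(baseUrl, data=params)
print(r)
with open(responseFilename, 'w') as f:
    f.write(r.text)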
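Instead of hard-coding the token, it could be read from the token response. The sketch below assumes the GET is made with an extra 'format': 'json' parameter and that the reply carries the token under query.tokens.csrftoken; the helper name csrf_token and the sample body are illustrative.

```python
import json


def csrf_token(response_text):
    """Extract the CSRF token from a JSON reply to action=query&meta=tokens."""
    data = json.loads(response_text)
    return data["query"]["tokens"]["csrftoken"]


# Hypothetical response body for an anonymous user (token is '+' and a backslash).
sample = '{"batchcomplete": "", "query": {"tokens": {"csrftoken": "+\\\\"}}}'
token = csrf_token(sample)
print(token)

# With requests, the same helper could be used roughly like this (not run here):
#   r = requests.get(baseUrl, params={'action': 'query', 'meta': 'tokens',
#                                     'format': 'json'})
#   token = csrf_token(r.text)
```

Reading the token this way keeps the script working if it is later run with a logged-in session, where the token is no longer the fixed anonymous value.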
Hi, I’m trying to create a page programmatically. I can get a token back from a GET, but doing an action=edit in a PUT does not create the page. The response I get back is the MediaWiki API help page.
My Python code is below. Can anyone see what I’m doing wrong?
Thanks,
Matthew
#!/bin/env python
import requests
import urllib
import urllib.parse

baseUrl = 'http://chlamyannotations-test2.princeton.edu/api.php'
params = {'action': 'query',
          'meta': 'tokens'}
responseFilename = '/molbio2/mcahn/temp/createPagesResponse.html'

r = requests.get(baseUrl, params=params)
print(r)
print(r.text)

params = {'action': 'edit',
          'title': 'TestPage',
          'summary': 'Test summary',
          'text': 'article content',
          'token': r'+\\'}

f = open(responseFilename, 'w')
r = requests.put(baseUrl, data=urllib.parse.urlencode(params))
print(r)
f.write(r.text)
f.close()
The Multi-Content Revisions project [1] introduced the concept of slots to
MediaWiki: instead of a single content, a revision can now contain multiple
content slots, each identified by a role name. With Gerrit change 413223
[2], the revision-related query API modules (action=query&prop=revisions,
action=query&prop=deletedrevisions, action=query&list=allrevisions, and
action=query&list=alldeletedrevisions) are updated to account for that:
* They take a new '<prefix>slots' parameter (where <prefix> is one of 'rv',
'drv', 'arv', 'adr') to indicate which roles to return information about.
Use '*' to return information about all slots, 'main' to return information
about the main slot (which is what was called the content of the revision
before multi-content revisions were introduced). When not used, it will
default to 'main' and the legacy response format will be used.
* Their '<prefix>prop' parameter takes a new value, 'roles', to list roles
for which a slot exists in the given revision.
* Using '<prefix>prop=content' or '<prefix>prop=contentmodel' without
specifying '<prefix>slots' has been deprecated.
* The parameter '<prefix>contentformat' has been deprecated. Clients should
be prepared to handle the default format.
* Using '<prefix>slots' together with the (already deprecated) parameters
'<prefix>expandtemplates', '<prefix>generatexml', '<prefix>parse',
'<prefix>diffto', '<prefix>difftotext', '<prefix>difftotextpst',
'<prefix>contentformat' and '<prefix>prop=parsetree' results in an error.
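A client could request all slots and read the content from whichever response shape it receives. This is a sketch under assumptions: the helper name slot_content is hypothetical, and the response shapes (a 'slots' mapping per revision, with content under '*' or under 'content' depending on the format version) are assumed for illustration and should be checked against the API help.

```python
# Example parameters for prop=revisions using the new 'rvslots' parameter.
QUERY_PARAMS = {
    "action": "query",
    "prop": "revisions",
    "titles": "Main Page",
    "rvprop": "content|roles",
    "rvslots": "*",
    "format": "json",
}


def slot_content(revision, role="main"):
    """Return the content of one slot from a decoded revision dict.

    Handles the slot-aware shape (revision['slots'][role]) and, for the
    main slot only, the legacy shape where content sits on the revision.
    """
    slots = revision.get("slots")
    if slots is not None:
        slot = slots.get(role, {})
    elif role == "main":
        slot = revision  # legacy response format
    else:
        slot = {}
    # Assumed key names: 'content' or '*' depending on the format version.
    return slot.get("content", slot.get("*"))


with_slots = {"roles": ["main"], "slots": {"main": {"*": "Hello"}}}
legacy = {"contentmodel": "wikitext", "*": "Hello (legacy)"}
print(slot_content(with_slots), "|", slot_content(legacy))
```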
Multi-Content Revisions is a work in progress; at this point in time, you
are unlikely to find a wiki with pages which have roles other than 'main'.
This will change in the next few months.
[1] https://www.mediawiki.org/wiki/Multi-Content_Revisions
[2] https://gerrit.wikimedia.org/r/c/mediawiki/core/+/413223
--
Gergő Tisza (Tgr)
Senior Software Engineer
Wikimedia Foundation