I am writing a Java program to extract the abstract of a Wikipedia page
given the page's title. I have done some research and found that the
abstract will be in rvsection=0.
So, for example, if I want the abstract of the "Eiffel Tower" page, I
query the API in the following way.
I then parse the XML data returned and take the wikitext in the tag <rev
xml:space="preserve">, which represents the abstract of the page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way to remove the infobox data and get
only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
Looking forward to your help.
Thanks in Advance
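
One possible way to drop the infobox from the section-0 wikitext is to remove
every balanced {{...}} block before using the text. The sketch below is only an
illustration, written in Python for brevity rather than Java: the function name
strip_templates and the sample wikitext are made up for this example, and it
removes all templates, not just the infobox, so inline templates in the prose
are lost as well.

```python
def strip_templates(wikitext: str) -> str:
    """Remove every balanced {{...}} block, including nested ones."""
    out = []
    depth = 0  # how many {{ are currently open
    i = 0
    while i < len(wikitext):
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            i += 2
        else:
            # Keep characters only when we are outside all templates.
            if depth == 0:
                out.append(wikitext[i])
            i += 1
    return "".join(out).strip()


# Hypothetical section-0 wikitext, not real API output.
sample = (
    "{{Infobox building\n| name = Eiffel Tower\n| height = {{convert|324|m}}\n}}\n"
    "The '''Eiffel Tower''' is a wrought-iron lattice tower in Paris."
)
print(strip_templates(sample))
```

The same brace-counting loop ports directly to Java with a StringBuilder. It
does not handle triple-brace parameters or <nowiki> sections, so treat it as a
starting point rather than a full wikitext parser.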
When list=allusers is used with auactiveusers, a property 'recenteditcount'
is returned in the result. In bug 67301 it was pointed out that this
property includes various other logged actions, and so should really be
named something like "recentactions".
Gerrit change 130093, merged today, adds the "recentactions" result
property. "recenteditcount" is also returned for backwards compatibility,
but will be removed at some point during the MediaWiki 1.25 development
cycle.
Any clients using this property should be updated to use the new property
name. The new property will be available on WMF wikis with 1.24wmf12, see
https://www.mediawiki.org/wiki/MediaWiki_1.24/Roadmap for the schedule.
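
For clients that must work across the transition, one option is to prefer the
new property and fall back to the old one. This is a minimal sketch: the helper
name recent_actions and the sample user entries are hypothetical, and it
assumes each user entry has already been decoded into a dict.

```python
from typing import Optional


def recent_actions(user: dict) -> Optional[int]:
    """Return the recent action count, preferring the new property name."""
    for key in ("recentactions", "recenteditcount"):
        if key in user:
            return int(user[key])
    return None  # neither property present (e.g. auactiveusers not set)


# Hypothetical entries as they might look before and after the change.
old_style = {"name": "ExampleUser", "recenteditcount": 12}
new_style = {"name": "ExampleUser", "recentactions": 12, "recenteditcount": 12}

print(recent_actions(old_style), recent_actions(new_style))
```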
Brad Jorsch (Anomie)
Mediawiki-api-announce mailing list
I found that some upload entries in the recent changes output have no page ID:
<rc type="log" ns="6" title="File:Lucian A. Sperta- Nunez.jpg"
rcid="114549183" pageid="0" revid="0" old_revid="0" user="Azarel63"
oldlen="0" newlen="0" timestamp="2014-01-05T11:09:38Z"
comment="User created page with UploadWizard" logid="77242320"
logtype="upload" logaction="upload" img_sha1="
<rc type="log" ns="6" title="File:Gingerbread spices (annotated).jpg"
rcid="114549185" pageid="30485540" revid="0" old_revid="0" user="SKopp"
oldlen="0" newlen="0" timestamp="2014-01-05T11:09:37Z"
comment="User created page with UploadWizard" logid="77242318"
logtype="upload" logaction="upload" img_sha1="
The first entry has no page ID (pageid="0"), but the second one does.
Can anybody tell me the difference?
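
To see how often this happens, the entries can be split by whether pageid is
"0". The sketch below does not explain the cause; it only separates the two
cases. The <recentchanges> wrapper, the trimmed attribute set, and the function
name split_by_pageid are assumptions made so the example is self-contained.

```python
import xml.etree.ElementTree as ET

# Trimmed, well-formed version of the two entries above (assumed wrapper element).
SAMPLE = """<recentchanges>
  <rc type="log" ns="6" title="File:Lucian A. Sperta- Nunez.jpg"
      rcid="114549183" pageid="0" logtype="upload" logaction="upload" />
  <rc type="log" ns="6" title="File:Gingerbread spices (annotated).jpg"
      rcid="114549185" pageid="30485540" logtype="upload" logaction="upload" />
</recentchanges>"""


def split_by_pageid(xml_text):
    """Return (titles with pageid 0 or missing, titles with a real pageid)."""
    root = ET.fromstring(xml_text)
    missing, present = [], []
    for rc in root.iter("rc"):
        target = missing if rc.get("pageid", "0") == "0" else present
        target.append(rc.get("title"))
    return missing, present


missing, present = split_by_pageid(SAMPLE)
print("no page id:", missing)
print("has page id:", present)
```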
I've just submitted Gerrit change 153110 which will overhaul the token
handling in the API, as described on the API Roadmap RFC. The patch is
not merged yet, feel free to join in the code review or reply with
comments. Follow the Gerrit change for any changes to the information
below. A followup to this announcement with deployment dates will be sent
once the change is merged.
For clients, all the old methods of fetching tokens will continue to work
with deprecation warnings. Usage levels of the deprecated methods on
queries to WMF wikis will be evaluated once the MediaWiki 1.25 development
Changes visible to clients include:
* All tokens are available from the new meta=tokens query submodule.
** The "centralauth" token, which was provided by action=tokens but wasn't
really a token in the sense of the rest, is now available from
** Note that it is possible to use meta=tokens along with other query prop,
list, and meta modules.
* The help for all token parameters clearly indicates which type of token
* The output from action=paraminfo includes the token type as a property on
the subobject describing the token parameter.
* All tokens may be cached as long as the session is valid; none are
dependent on factors such as the page being edited or the user being
* Most token types have been replaced with a single 'csrf' token. This has
long been the case in practice; this just makes it official.
* The tokens returned for action=rollback and action=userrights (and
certain extension modules) are no longer the same tokens used in the
corresponding features in the web UI. The web UI tokens are accepted by the
API for compatibility, but not vice versa.
* Any API query (with a few exceptions, mainly queries to the 'feed'
modules) will return the current timestamp when passed the 'curtimestamp'
parameter. This may be used to fetch the starttimestamp necessary for
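
As a rough illustration of the client-side changes listed above, the sketch
below builds a meta=tokens query and pulls the 'csrf' token out of a decoded
JSON response. The endpoint URL, the 'type' parameter, the response layout
(query.tokens.csrftoken), and the sample response are assumptions not stated in
this announcement; check them against the api.php help output once the change
is deployed.

```python
from urllib.parse import urlencode

API_URL = "https://www.mediawiki.org/w/api.php"  # assumed endpoint


def token_query_url(api_url, token_type="csrf"):
    """Build a URL requesting one token type plus the current timestamp."""
    params = {
        "action": "query",
        "meta": "tokens",
        "type": token_type,      # assumed parameter name
        "curtimestamp": "1",     # ask the server for its current timestamp
        "format": "json",
    }
    return api_url + "?" + urlencode(params)


def extract_token(response, token_type="csrf"):
    """Read the token from a decoded response (assumed layout)."""
    return response["query"]["tokens"][token_type + "token"]


# Hypothetical decoded response, for illustration only.
sample_response = {
    "curtimestamp": "2014-08-12T00:00:00Z",
    "query": {"tokens": {"csrftoken": "0123456789abcdef+\\"}},
}

print(token_query_url(API_URL))
print(extract_token(sample_response))
```

Because the tokens no longer depend on the page or user state, a client could
fetch the token once per session and reuse it for later write requests.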
For extension authors, if your extension uses the core token handling it
*will* need updating. I've already submitted patches for the 26 extensions
hosted in WMF's Gerrit repository. The necessary changes are:
* needsToken() must return a string or false; true will result in an error.
Unless there are special security issues that require a custom salt, 'csrf'
should be returned.
** Since any truthy string is equivalent to the old behavior of returning
boolean true, this will continue to work with older versions of MediaWiki.
* If a custom salt is needed, the new 'ApiQueryTokensRegisterTypes' hook
must be used to register it.
* If the web UI will be using a different salt (e.g. because it's included in
links rather than posted form fields), a method getWebUITokenSalt() may be
overridden to supply this salt for compatibility.
* It is no longer necessary to return data for 'token' from
getAllowedParams() or getParamDescription(). Any return from
getAllowedParams() will be overridden; a string from getParamDescription()
will also be overridden with a standard message, while an array will have
the standard message prepended.
** Compatibility with older versions of MediaWiki may be maintained by
continuing to return data for 'token' from getAllowedParams() and a string
for 'token' from getParamDescription().
* getTokenSalt() is no longer called or defined in ApiBase, and may be
removed once compatibility with older versions of MediaWiki is no longer
Brad Jorsch (Anomie)
Summing up, it seems like "action API" and "api.php" are the two contenders.
"api.php" is least likely to be confused with anything (only its own entry
point file). But as a name it's somewhat awkward.
"action API" might be confused with the Action class and its subclasses,
although that doesn't seem like a big deal.
As for the rest:
Just "API" is already causing confusion. Although it'll certainly continue
to be used in many contexts.
"MediaWiki API", "Web API", and "MediaWiki web API" are liable to be
confused with the proposed REST API, which is also supposed to be
web-accessible and will theoretically be part of MediaWiki (even though I'd
guess it's probably going to be implemented as an -oid). "MediaWiki web
APIs" may well grow to encompass the api.php action API, the REST API, and
maybe even stuff like Parsoid.
"MediaWiki API" and "Core API" are liable to be confused with the various
hooks and PHP classes used by extensions.
"JSON API" wouldn't be accurate for well into the future, and would likely
be confused with other JSON-returning APIs such as Parsoid and maybe REST.
"Classic API" makes it sound like there's a full replacement.
All the code name suggestions would be making things less clear, not more.
If it had started out with a code name there would be historical inertia,
but using a code name now would just be silly.
I'm trying to write some basic test bots before I start writing
patches for the Java Wiki Bot Framework (JWBF). I've been able to get
one to compile, but I have persistent runtime errors.
I've put the relevant information and things I've already tried in
this gist: https://gist.github.com/fhocutt/0221c1a35265ce7cda33
Has anyone had similar problems while getting JWBF set up, or seen
similar behavior from other Java projects?
When I wrote the new API client library standard, one of the intended
effects was that libraries would make it easy for bot-runners to
comply with the user-agent policy found at
https://meta.wikimedia.org/wiki/User-agent_policy . However, different
people understand the policy to mean different things.
As I read it, the relevant parts of the policy are:
"If you run a bot, please send a User-Agent header identifying the bot
and supplying some way of contacting you, e.g.:
`User-Agent: MyCoolTool/1.1 (http://example.com/MyCoolTool/;
There's been some discussion in the context of my client library
evaluation project here:
and here: https://github.com/jpatokal/mediawiki-gateway/issues/65 . As
I understood it, the example provided demonstrated the requirements,
but it's now clear to me that there's room for ambiguity in the
interpretation of the user-agent policy.
My question is: what information is essential to "identify" a bot? The
example given appears to contain the bot name and version, a link to a
page with more information and/or the repository for the bot's code,
and the framework that was used to write the bot (SuperLib/1.4, I
assume). Does WMF operations want all of these components? What is the
minimum necessary to comply with the policy, and what is bonus
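
To make the components under discussion concrete, here is a small sketch of how
a client library might assemble such a header from its parts. The function name
build_user_agent, the contact address, and the choice of which parts are
optional are assumptions for illustration; they are not a statement of what the
policy requires.

```python
def build_user_agent(name, version, info_url, contact=None, library=None):
    """Assemble 'Name/Version (info; contact) Library/Version'."""
    # Parenthesised details: info page and, if given, a contact address.
    details = "; ".join(part for part in (info_url, contact) if part)
    user_agent = f"{name}/{version} ({details})"
    if library:
        # Framework the bot is built on, e.g. "SuperLib/1.4".
        user_agent += f" {library}"
    return user_agent


# Hypothetical values; the contact address is a placeholder.
headers = {
    "User-Agent": build_user_agent(
        "MyCoolTool",
        "1.1",
        "http://example.com/MyCoolTool/",
        contact="bot-owner@example.org",
        library="SuperLib/1.4",
    )
}
print(headers["User-Agent"])
```

A library could fill in the last component itself and ask the bot author only
for the name, version, and contact details.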
UnicornPI (unicorn-pie) sounds good to me ;)
On 8/7/14, Brandon Harris <bharris(a)wikimedia.org> wrote:
> something something Unicorn.
> On Aug 6, 2014, at 2:32 PM, Brad Jorsch (Anomie) <bjorsch(a)wikimedia.org>
>> When api.php was basically the only API in MediaWiki, calling it "the API"
>> worked well. But now we've got a Parsoid API, Gabriel's work on a REST
>> content API, Gabriel's work on an internal storage API, and more on the
>> way. So just saying "the API" is getting confusing.
>> So let's bikeshed a reasonably short name for it that isn't something
>> like "the api.php API". I'm horrible at naming.
>> Brad Jorsch (Anomie)
>> Software Engineer
>> Wikimedia Foundation
>> Wikitech-l mailing list
> Brandon Harris, Senior Designer, Wikimedia Foundation
> Support Free Knowledge: http://wikimediafoundation.org/wiki/Donate
> Wikitech-l mailing list