Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given its title. I have done some research and found out that the abstract
will be in rvsection=0.
So, for example, if I want the abstract of the 'Eiffel Tower' wiki page,
I query the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
and parse the XML data we get back, taking the wikitext in the tag <rev
xml:space="preserve">, which represents the abstract of the Wikipedia page.
But this wikitext also contains the infobox data, which I do not need. I
would like to know if there is any way to remove the infobox data and get
only the wikitext related to the page's abstract, or if there is an
alternative method by which I can get the abstract of the page directly.
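A minimal sketch of the infobox-stripping step described above. The
brace-matching heuristic here is an assumption, not a full wikitext parser
(it only removes top-level {{...}} template blocks at the start of the
section text):

```python
def strip_leading_templates(wikitext):
    """Remove top-level {{...}} template blocks (e.g. the infobox) from
    the start of a wikitext section, leaving the prose introduction.
    NOTE: a simple brace-matching heuristic, not a full wikitext parser.
    """
    text = wikitext.lstrip()
    while text.startswith("{{"):
        depth = 0
        i = 0
        while i < len(text) - 1:
            pair = text[i:i + 2]
            if pair == "{{":
                depth += 1
                i += 2
            elif pair == "}}":
                depth -= 1
                i += 2
                if depth == 0:
                    break
            else:
                i += 1
        text = text[i:].lstrip()
    return text

sample = ("{{Infobox building|name=Eiffel Tower}}\n"
          "The '''Eiffel Tower''' is a wrought-iron lattice tower.")
print(strip_leading_templates(sample))
```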
Looking forward to your help.
Thanks in advance,
Aditya Uppu
When list=allusers is used with auactiveusers, a property 'recenteditcount'
is returned in the result. In bug 67301[1] it was pointed out that this
property is including various other logged actions, and so should really be
named something like "recentactions".
Gerrit change 130093,[2] merged today, adds the "recentactions" result
property. "recenteditcount" is also returned for backwards compatibility,
but will be removed at some point during the MediaWiki 1.25 development
cycle.
Any clients using this property should be updated to use the new property
name. The new property will be available on WMF wikis with 1.24wmf12, see
https://www.mediawiki.org/wiki/MediaWiki_1.24/Roadmap for the schedule.
[1]: https://bugzilla.wikimedia.org/show_bug.cgi?id=67301
[2]: https://gerrit.wikimedia.org/r/#/c/130093/
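A client-side sketch of the suggested migration: prefer the new
"recentactions" key and fall back to the deprecated "recenteditcount" on
older MediaWiki versions (the sample dict stands in for one item of a real
list=allusers&auactiveusers response):

```python
def recent_actions(user):
    """Prefer the new 'recentactions' key, falling back to the
    deprecated 'recenteditcount' on pre-1.25 MediaWiki versions."""
    return user.get("recentactions", user.get("recenteditcount", 0))

# Simulated item from list=allusers&auactiveusers output:
user = {"name": "Example", "recenteditcount": 7, "recentactions": 7}
print(recent_actions(user))  # 7
```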
--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
_______________________________________________
Mediawiki-api-announce mailing list
Mediawiki-api-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce
There's an inconsistency in the handling of multi-value parameters where if
you pass a value consisting of only whitespace[1] it will be interpreted as
an empty set rather than as a single value consisting of whitespace. For
example, https://www.mediawiki.org/w/api.php?action=query&titles= correctly
handles the query as specifying no titles, but
https://www.mediawiki.org/w/api.php?action=query&titles=%20 is also treated
as specifying no titles rather than specifying a title consisting of a
space character.
With Gerrit change 405609,[2] the behavior will be changed so that the
latter query will behave more like
https://www.mediawiki.org/w/api.php?action=query&titles=_ in reporting that
the supplied title is invalid. The plan is that this will be deployed to
WMF wikis with 1.31.0-wmf.20 on February 6–8, see
https://www.mediawiki.org/wiki/MediaWiki_1.31/Roadmap for the schedule.
Passing an empty string as the value of a multi-valued parameter will
continue to be treated as a set of zero elements rather than a one-element
set containing the empty string.
Most clients should handle this change without major issue, as the
resulting response is consistent with the response when any other invalid
title is submitted. Clients that want to continue treating whitespace-only
input as no input should begin checking for whitespace-only input before
submitting it to the API.
[1]: Specifically: spaces (U+0020), tabs (U+0009), line feeds (U+000A), and
carriage returns (U+000D).
[2]: https://gerrit.wikimedia.org/r/#/c/405609/
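Clients that want to keep the old behavior can pre-check for
whitespace-only input before submitting, as suggested above. A minimal
sketch using exactly the characters listed in footnote [1]:

```python
API_WHITESPACE = " \t\n\r"  # U+0020, U+0009, U+000A, U+000D

def is_effectively_empty(value):
    """True if a multi-value parameter consists only of the whitespace
    characters the API previously collapsed to an empty set."""
    return value.strip(API_WHITESPACE) == ""

print(is_effectively_empty(" "))    # True
print(is_effectively_empty("Foo"))  # False
```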
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
The tags query module has a tgprop parameter to specify which properties of
the tag should be returned. One of these properties was 'name', but the
name was always included in the response regardless of whether this
property was included in tgprop.
Since most uses aren't specifying 'name', and we can assume clients are
depending on the name being included (since there's otherwise no way to
identify which tag is which), we've removed the nonfunctional 'name' as a
valid option for tgprop. Those few clients that do specify 'name' there
will now receive a warning that it is not a valid value for tgprop.
This change will not affect the functionality of any clients unless the new
warning somehow breaks them. It should be deployed to WMF wikis with
1.31.0-wmf.18 or later. Clients can safely stop specifying 'name' in tgprop
immediately, though, since it doesn't do anything.
For further information, see https://phabricator.wikimedia.org/T185058.
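The cleanup on the client side is one line: drop 'name' from the
pipe-separated tgprop value before sending it (the other values in the
example are ordinary tgprop options):

```python
def clean_tgprop(tgprop):
    """Drop the removed 'name' value from a pipe-separated tgprop
    parameter; the tag name is always returned anyway."""
    return "|".join(p for p in tgprop.split("|") if p != "name")

print(clean_tgprop("name|displayname|hitcount"))  # displayname|hitcount
```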
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
Hello everyone,
I am trying to use the MediaWiki API to create a dictionary based on
categories or lists on Wikipedia. I would like to be able to select a
category, or perhaps a list page, and get all members of that list.
I've done some reading of the API and implemented a prototype. It works,
but only when the data is structured just right for my purposes. For
example, I can easily get a list of all of the
English-language films. I'm using the action=query and list=categorymembers
for this. I end up with 500 films at a time, and I can continue as needed
to get all 60k or so. This is because there is a category that is tagged to
each English-language film's individual page.
On the other hand, if I want to get a list of all National Hockey League
(NHL) players, this is a lot more difficult. The category "Category:Lists
of National Hockey League players" exists, but it's a category of lists of
players. Much of the categorization of Wikipedia turns out to be in lists,
not categories. I could write a web scraper for this, but that would
probably be very unreliable.
Is there a standardized way to deal with lists and sublists that I might
have missed? I don't mind writing a bunch of code to recursively crawl
sublists and expand them, but I would like to avoid something as
non-standard as web scraping the content, because it would be very fragile.
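There's no single standard API for expanding list *pages* (those would need
something like prop=links on each list page), but subcategory trees can be
walked with list=categorymembers. A minimal sketch of that recursion, with
the HTTP fetch injected so the canned data below stands in for real API
responses (continuation handling omitted for brevity):

```python
def crawl_category(fetch, category, seen=None):
    """Recursively collect page titles under a category, descending into
    subcategories. `fetch(category)` must return a list of member dicts
    shaped like list=categorymembers output (each with 'title' and 'ns';
    ns 14 marks a subcategory)."""
    if seen is None:
        seen = set()          # guards against category cycles
    if category in seen:
        return []
    seen.add(category)
    pages = []
    for member in fetch(category):
        if member["ns"] == 14:  # namespace 14 = Category
            pages.extend(crawl_category(fetch, member["title"], seen))
        else:
            pages.append(member["title"])
    return pages

# A real fetch would call, per category:
#   https://en.wikipedia.org/w/api.php?action=query&list=categorymembers
#     &cmtitle=<category>&cmlimit=500&format=json
fake = {
    "Category:A": [{"title": "Page 1", "ns": 0},
                   {"title": "Category:B", "ns": 14}],
    "Category:B": [{"title": "Page 2", "ns": 0}],
}
print(crawl_category(fake.get, "Category:A"))  # ['Page 1', 'Page 2']
```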
Thank you for the help,
-mike
Hello!
I hope this is the correct place to ask this question.
My name is Tal, and I'm developing my own version of the Wiki Game
<https://en.wikipedia.org/wiki/Wikipedia:Wiki_Game> (or Wiki Race, as some
may say) for Android devices. There is a great multiplayer version
<https://thewikigame.com/> of it for the iPhone and the web, but I think
there should be a worthy version for Android.
My version is still in progress, and I put a lot of effort into it,
including the gameplay experience (single-player and multiplayer) and the
design.
_________________________
I'm writing this email because I'd like to consult with some wiki experts
about one of the main things on my mind: my usage of the MediaWiki API. I'd
like to launch the game and upload it to the Play Store in a few months,
but I can't keep working on it while knowing that I might put my efforts in
danger by doing so, because of unfair usage of the Wiki API.
Well, the game is based on generating two random articles: the first
article (the starting point) and the target article (the end point), which
the user should reach using only the hyperlinks inside the articles.
The thing is that, unlike the currently existing Wiki Game
<https://thewikigame.com/> version, which is played by many users together
at the same time (by joining "game rooms") or with pre-generated articles,
I'm generating the articles live for each user individually. Each user can
play in single-player mode and generate his own pair of articles, his own
game, which is going to keep the game infinite for all of the players.
My generator generates two articles before each game, and the user has 15
seconds to decide whether he would like to start the game or not. After the
15 seconds are up, the generator generates a new pair of articles. While
the time is running, the user may refresh the articles manually and ask to
generate a new pair.
So why am I worried about my app's usage of the API?
Well, not all of the target articles are reachable through the hyperlinks
in the mobile version of Wikipedia, because the templates view in the
mobile version is very limited and there are articles without references
from other articles, and I don't want the users to get stuck.
To overcome this, and make sure that the target article is actually
reachable, I need to make about 5 requests to the REST API for each
article generation (one request after another, not together). The
algorithm was made after a very long time of research by me, and I think
it's efficient for the purpose.
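The reachability check itself isn't spelled out above, so purely as a
hypothetical illustration: one single-request version of such a check would
ask prop=linkshere whether any article links to the target at all. The
fetch function and the canned data here are stand-ins for live API calls:

```python
def is_reachable(fetch_backlinks, title):
    """Rough reachability test: a target is playable only if at least
    one other article links to it. `fetch_backlinks(title)` should
    return the 'linkshere' list for the page, as produced by
    action=query&prop=linkshere&lhnamespace=0&lhlimit=1&titles=<title>.
    """
    return len(fetch_backlinks(title)) > 0

# Hypothetical canned responses standing in for live API calls:
backlinks = {"Eiffel Tower": [{"title": "Paris"}], "Orphan page": []}
print(is_reachable(backlinks.get, "Eiffel Tower"))  # True
print(is_reachable(backlinks.get, "Orphan page"))   # False
```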
So, I assume that the average player is going to play about an hour a day,
so he is going to refresh his pair about 15 times, which is 75 requests to
the API in a given day for one user. Let's say (hopefully and
hypothetically) the game goes pretty viral - well, there are going to be a
lot of requests to the API at the same time, and this makes me unsure about
my way of action.
I don't mind sending my random-article-generator code (as long as it is not
copied :P ) for a review and working together with somebody to make the
game better and fair, and I'm also willing to donate a part of the game's
income (if there is any) to the Wikimedia Foundation. But there is one
thing I'm not going to do, and that's launching the game before making sure
I make fair use of the API. I know Wikipedia handles millions and billions
of requests every day, and still - I can't move on without knowing
everything is fine.
This information <https://www.mediawiki.org/wiki/API:Etiquette> about the
usage limits is not very explicit; it doesn't draw the line between good
and bad use, and maybe I'm using it badly. I will accept every answer
respectfully.
Thanks in advance,
Tal