Hello,
I am writing a Java program to extract the abstract of a Wikipedia page
given the title of the page. From my research, it appears the abstract
will be in rvsection=0.
So, for example, if I want the abstract of the 'Eiffel Tower' wiki page, I
query the API in the following way:
http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…
and then parse the XML we get back, taking the wikitext inside the <rev
xml:space="preserve"> tag, which represents the abstract of the Wikipedia
page. But this wikitext also contains the infobox data, which I do not need.
I would like to know whether there is any way to strip out the infobox data
and get only the wikitext for the page's abstract, or whether there is an
alternative method for getting the abstract of the page directly.
Looking forward to your help.
Thanks in Advance
Aditya Uppu
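
One alternative worth noting here is the TextExtracts extension (prop=extracts
with exintro), which returns just the rendered lead section, so infobox
wikitext never appears in the result. A minimal Java sketch, assuming Java 11+
and a wiki where TextExtracts is enabled (it is on en.wikipedia.org):

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class WikiAbstract {
    public static void main(String[] args) throws Exception {
        // prop=extracts (TextExtracts) returns only the rendered lead section,
        // so the infobox wikitext never shows up in the response.
        String title = URLEncoder.encode("Eiffel Tower", StandardCharsets.UTF_8);
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=query&prop=extracts&exintro=1&explaintext=1"
                + "&format=json&titles=" + title;

        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .header("User-Agent", "AbstractFetcher/0.1 (replace with your contact info)")
                .GET()
                .build();
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());

        // The JSON contains query.pages.<pageid>.extract; a real client would
        // use a JSON library here instead of printing the raw response.
        System.out.println(response.body());
    }
}

Parsing the rvsection=0 wikitext and stripping the infobox template by hand is
also possible, but the extract call above avoids the problem entirely.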
When list=allusers is used with auactiveusers, a property 'recenteditcount'
is returned in the result. In bug 67301[1] it was pointed out that this
property is including various other logged actions, and so should really be
named something like "recentactions".
Gerrit change 130093,[2] merged today, adds the "recentactions" result
property. "recenteditcount" is also returned for backwards compatability,
but will be removed at some point during the MediaWiki 1.25 development
cycle.
Any clients using this property should be updated to use the new property
name. The new property will be available on WMF wikis with 1.24wmf12, see
https://www.mediawiki.org/wiki/MediaWiki_1.24/Roadmap for the schedule.
[1]: https://bugzilla.wikimedia.org/show_bug.cgi?id=67301
[2]: https://gerrit.wikimedia.org/r/#/c/130093/
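
For example, a client reading this list might prefer the new property and fall
back to the old one while older MediaWiki versions are still deployed; a rough
Java sketch (Java 11+, JSON handling reduced to comments, wiki URL is just an
example):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ActiveUsers {
    public static void main(String[] args) throws Exception {
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=query&list=allusers&auactiveusers=1&aulimit=10&format=json";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder().uri(URI.create(url)).GET().build();
        String body = client.send(request, HttpResponse.BodyHandlers.ofString()).body();

        // After parsing the JSON, read each entry in query.allusers as:
        //   count = user.has("recentactions") ? user.get("recentactions")    // new name
        //                                     : user.get("recenteditcount"); // pre-1.24wmf12 fallback
        System.out.println(body);
    }
}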
--
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation
Hi Mediawiki-api mailing listers!
I'm trying to extract the full list of references for different revisions
of Wikipedia pages. This seemed like it would be easy enough with
"prop=references", but I keep getting the following error:
"error": {
"code": "citestoragedisabled",
"info": "Cite extension reference storage is not enabled.",
"*": "See https://en.wikipedia.org/w/api.php for API usage."
}
Even the example on the API's documentation page returns this error
(https://en.wikipedia.org/w/api.php?action=help&modules=query%2Breferences).
Why does this occur? And is there no way of getting references through the API?
Thanks!
Best wishes,
Bertel
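
For clients probing for this support, it is safer to detect the stable error
code rather than the human-readable message; a small Java sketch (Java 11+,
title and wiki chosen arbitrarily):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ReferencesProbe {
    public static void main(String[] args) throws Exception {
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=query&prop=references&titles=Albert%20Einstein&format=json";
        String body = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder().uri(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString()).body();

        // Match on the error code, not the translatable "info" text.
        if (body.contains("\"citestoragedisabled\"")) {
            System.out.println("prop=references is not available on this wiki");
        } else {
            System.out.println(body);
        }
    }
}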
To operate correctly, action=purge needs to write to the database, which
means it should be done using a POST rather than a GET request.
As of Gerrit change 310560,[1] action=purge will begin emitting a warning
when used via GET. This should be deployed to WMF wikis with 1.28.0-wmf.20,
see https://www.mediawiki.org/wiki/MediaWiki_1.28/Roadmap for the schedule.
Clients that use action=paraminfo to determine whether to use GET or POST
for an action should automatically switch to POST; any others should
manually switch to using POST for this action as soon as possible.
To check if your client's user agent is detected making such submissions,
you can also use ApiFeatureUsage[2] and look for 'purge-via-GET' once
1.28.0-wmf.20 is rolled out to wikis your client is using.
It is planned that this warning will be changed to an error during 1.29.
Let's avoid having a repeat of T142155,[3] update your code ASAP instead of
waiting until it breaks. Thanks.
[1]: https://gerrit.wikimedia.org/r/#/c/310560/
[2]: https://meta.wikimedia.org/wiki/Special:ApiFeatureUsage
[3]: https://phabricator.wikimedia.org/T142155
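
For clients constructing the request by hand, the switch just means sending
the parameters as a POST body; a minimal Java sketch (Java 11+, page title and
wiki are placeholders):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class PurgePage {
    public static void main(String[] args) throws Exception {
        // Send the purge as a POST body rather than as GET query parameters.
        String form = "action=purge&format=json&titles=Main%20Page";
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create("https://en.wikipedia.org/w/api.php"))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(form))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}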
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
Hi wikimedia team,
This is Lee-Wei from Yahoo. Thanks a lot to Erik for the traffic numbers and
to Max for the response time estimation. Currently, we will only use the
opensearch API in our backend (with a cache). Here are some detailed traffic
numbers:
- expected RPS to opensearch: peak 900 RPS (we will use a cache, so if the
cache layer works fine, the average RPS would be 15)
We plan to release this feature on Jan. 5th, 2017. Please do let us know if
you have any questions or concerns about it.
Thanks again for the team's kind help and support.
Cheers,
Lee-Wei
On Tuesday, November 15, 2016 10:07 AM, Erik Bernhardson <ebernhardson@wikimedia.org> wrote:
(cc'ing the discovery mailing list, as that team owns both the implementation and operation of search.)
I can partially answer this as one of the people responsible for search, but I have to defer to others about API, bots, and such.
This would be a noticeable portion of our traffic. For reference:
- action=opensearch (and generator variants): 1.5k RPS
- action=query&list=search (and generator variants): 600 RPS
- all API traffic: 8k RPS (might be a bit higher; this is averaged over an hour)
opensearch is relatively cheap; the p95 to our search servers is ~30ms, with p50 at 7ms. So 600 RPS of opensearch traffic wouldn't be too hard on our search cluster. Using action=query is going to be too heavy; the full-text searches are computationally more expensive to serve.
Might I ask which wiki(s) you would be querying against? opensearch traffic is spread across our search cluster, but individual wikis only hit portions of it. For example, opensearch on en.wikipedia.org is served by ~40% of the cluster, but zh.wikipedia.org (Chinese) is only served by ~13%. If you are going to send heavy traffic to zh, I might need to adjust those numbers to spread the load to more servers (easy enough, I just need to know).
Additionally, you mentioned descriptions and keywords. These would not be provided directly by the opensearch api so you might be thinking of using the generator version of it (action=query&generator=prefixsearch) to get the results augmented (ex: /w/api.php?action=query&format=json&prop=extracts&generator=prefixsearch&exlimit=5&exintro=1&explaintext=1&gpssearch=yah&gpslimit=5). I'm not personally sure how expensive that is, someone else would have to chime in.
So, from a computational point of view and only with respect to the search portion of our cluster, this seems plausible as long as we coordinate so that we know the traffic is coming. Others will have to chime in about the wider picture.
Erik B.
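
As a concrete illustration of the calls being discussed, a plain
action=opensearch request looks like the Java sketch below (Java 11+; the
search term and limit are arbitrary, and the prop=extracts variant Erik
mentions differs only in the URL parameters):

import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class OpenSearchExample {
    public static void main(String[] args) throws Exception {
        String term = URLEncoder.encode("yah", StandardCharsets.UTF_8);
        // action=opensearch returns [query, [titles...], [descriptions...], [urls...]]
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=opensearch&format=json&limit=5&search=" + term;
        HttpResponse<String> response = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder().uri(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}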
On Mon, Nov 14, 2016 at 4:40 PM, Eric Kuo <erickuo@yahoo-inc.com> wrote:
Hi,
This is Eric from Yahoo. My team develops mobile apps for Taiwan and Hong Kong users. We want to provide wiki description on keywords in our contents, and we consider using MediaWiki API:OpenSearch and/or API:Query to achieve this. Our estimated RPS is 900, and we will cache the query result on our side. We would like to know if there is any concern with respect to our RPS, and if so, what is the best practice.
Any comments and suggestions are welcome. Thank you for your time.
Best regards,
Eric
Hello,
I am using MediaWiki 1.28.0.
I want to provision new users (create new accounts) in MediaWiki using the
API. The provisioning will be driven by a callout from another service (a
registry that enrolls and manages users in a project using the wiki). Users
do not participate directly in the provisioning.
I have set up my client as an OAuth Owner-only consumer as detailed at
https://www.mediawiki.org/wiki/OAuth/Owner-only_consumers
I am able to request an account creation token using this code:
$oauth = new OAuth( $consumerKey, $consumerSecret,
    OAUTH_SIG_METHOD_HMACSHA1, OAUTH_AUTH_TYPE_AUTHORIZATION );
$oauth->setToken( $accessToken, $accessSecret );
$oauth->fetch(
    'https://myserver/w/api.php?action=query&meta=tokens&type=createaccount&form…',
    null, OAUTH_HTTP_METHOD_PUT );
$response = json_decode($oauth->getLastResponse(), true);
When I execute that code the $response is, for example,
Array
(
[batchcomplete] =>
[query] => Array
(
[tokens] => Array
(
[createaccounttoken] =>
8f9c1c7c2b38918cb5caac5c87dd2084585bf6c3+\
)
)
)
Note that the end of the token includes +\.
First question: Is that form of the token, specifically having +\ at the
end, correct and expected?
If I then take that token and execute
$createAccountToken = $response['query']['tokens']['createaccounttoken'];
$oauth->fetch("
https://myserver/w/api.php?action=createaccount&format=json&name=FooBar&ema…",
null, OAUTH_HTTP_METHOD_PUT);
I receive
{"error":{"code":"createnotoken","info":"The token parameter must be
set","*":"See https://myserver/w/api.php for API usage"}}
Second question: What am I doing wrong when invoking the createaccount
action?
I am following documentation at
https://www.mediawiki.org/wiki/API:Account_creation
but it is not clear to me which parts of that page may be deprecated and
precisely how I should provision a new account.
I appreciate any insights.
Thanks,
Scott K
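
For comparison, the non-deprecated flow documented at API:Account_creation for
MediaWiki 1.27+ is two requests against the same session: fetch a token with
meta=tokens&type=createaccount, then POST action=createaccount with the token
in the request body. A rough Java sketch follows (Java 11+, using session
cookies instead of the OAuth signing above; the parameter names used here,
such as createtoken, retype and createreturnurl, and the wiki URL and
credentials are assumptions to verify against your wiki's action=paraminfo):

import java.net.CookieManager;
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CreateAccountSketch {
    public static void main(String[] args) throws Exception {
        String api = "https://myserver/w/api.php";   // placeholder wiki
        // One client with a cookie jar, so the token stays tied to the session.
        HttpClient client = HttpClient.newBuilder()
                .cookieHandler(new CookieManager())
                .build();

        // Step 1: fetch a createaccount token.
        String tokenBody = client.send(
                HttpRequest.newBuilder()
                        .uri(URI.create(api + "?action=query&meta=tokens&type=createaccount&format=json"))
                        .GET().build(),
                HttpResponse.BodyHandlers.ofString()).body();
        Matcher m = Pattern.compile("\"createaccounttoken\":\"([^\"]+)\"").matcher(tokenBody);
        if (!m.find()) throw new IllegalStateException("no token in: " + tokenBody);
        // Undo the JSON escaping of the trailing backslash; the "+\" ending is expected.
        String token = m.group(1).replace("\\\\", "\\");

        // Step 2: POST action=createaccount with the token in the body, not the URL.
        String form = "action=createaccount&format=json"
                + "&username=" + URLEncoder.encode("FooBar", StandardCharsets.UTF_8)
                + "&password=" + URLEncoder.encode("secret-placeholder", StandardCharsets.UTF_8)
                + "&retype=" + URLEncoder.encode("secret-placeholder", StandardCharsets.UTF_8)
                + "&createreturnurl=" + URLEncoder.encode("https://myserver/", StandardCharsets.UTF_8)
                + "&createtoken=" + URLEncoder.encode(token, StandardCharsets.UTF_8);
        String result = client.send(
                HttpRequest.newBuilder()
                        .uri(URI.create(api))
                        .header("Content-Type", "application/x-www-form-urlencoded")
                        .POST(HttpRequest.BodyPublishers.ofString(form))
                        .build(),
                HttpResponse.BodyHandlers.ofString()).body();
        System.out.println(result);
    }
}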
With formatversion=1, boolean response fields are typically returned as the
empty string when true and are absent from the response when false. With
formatversion=2, boolean response fields in JSON and PHP formats are
supposed to be returned as native boolean true when true, and either native
boolean false or absent from the response when false.
Any boolean response fields that use the formatversion=1 semantics with
formatversion=2 should be fixed. Please file tasks in Phabricator if you
encounter such fields.[1] Clients using formatversion=2 should be prepared
for such response fields to be fixed without further warning.
[1]:
https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?tag=MediaWiki…
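
A defensive client can treat both conventions uniformly when reading a parsed
response object; a small Java sketch (the Map here simply stands in for
whatever structure your JSON library produces):

import java.util.Map;

public final class ApiBooleans {
    private ApiBooleans() {}

    // formatversion=1: field present (as "") means true, absent means false.
    // formatversion=2: native true/false, and false may simply be absent.
    public static boolean readFlag(Map<String, Object> responseObject, String field) {
        Object value = responseObject.get(field);
        if (value == null) {
            return false;                 // absent => false under both conventions
        }
        if (value instanceof Boolean) {
            return (Boolean) value;       // formatversion=2 native boolean
        }
        return true;                      // formatversion=1 empty-string marker
    }
}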
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
With the merge of Gerrit change 321406,[1] come some breaking changes to
the formatting of errors and warnings from some modules. These changes
should be on Beta Labs shortly, and should be deployed to WMF wikis with
1.29.0-wmf.6.
- Error codes returned by some modules have changed. Notably, codes from
query submodules will no longer include the module prefix, e.g. if you
supply a bad continuation parameter for prop=revisions it will report error
code 'badcontinue' instead of 'rvbadcontinue'. Other codes may have changed
as well, e.g. for list=allpages the error for attempting to use redirects=1
in generator mode is now 'invalidparammix' instead of 'params'.
- If you're attempting to parse the human-readable 'info' text of
errors, that text may have changed. Use cases should be submitted in
Phabricator[2] as feature requests to include the needed data in a
machine-readable format alongside the error message.
- action=emailuser may return a "Warnings" status, and now returns
'warnings' and 'errors' subelements (as applicable) instead of 'message'.
- action=imagerotate returns an 'errors' subelement rather than
'errormessage'.
- action=move now reports errors when moving the talk page as an array
under key 'talkmove-errors', rather than using 'talkmove-error-code' and
'talkmove-error-info'. The format for subpage move errors has also changed.
  - action=rollback no longer returns a "messageHtml" property on errors.
Use errorformat=html if you want HTML formatting of messages.
- action=upload now reports optional stash failures as an array under
key 'stasherrors' rather than a 'stashfailed' text string.
  - action=watch reports 'errors' and 'warnings' instead of a single
'error'.
This same change brings the ability to request errors and warnings in
languages other than English and formats other than plain text. See the new
'errorformat', 'errorlang', and 'errorsuselocal' parameters to the 'main'
module.
[1]: https://gerrit.wikimedia.org/r/#/c/321406/
[2]:
https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?tag=MediaWiki…
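
To see the new behaviour, a request can opt into the richer error formats
explicitly; a short Java sketch (Java 11+; the deliberately bad rvcontinue
value just forces an error, and the wiki is an example):

import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ErrorFormatExample {
    public static void main(String[] args) throws Exception {
        // An invalid continuation parameter triggers an error; with
        // 1.29.0-wmf.6+ the reported code is 'badcontinue', not 'rvbadcontinue'.
        String url = "https://en.wikipedia.org/w/api.php"
                + "?action=query&prop=revisions&titles=Main%20Page&format=json"
                + "&rvcontinue=bogus"
                + "&errorformat=plaintext&errorlang=en";
        String body = HttpClient.newHttpClient().send(
                HttpRequest.newBuilder().uri(URI.create(url)).GET().build(),
                HttpResponse.BodyHandlers.ofString()).body();
        System.out.println(body);
    }
}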
--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation
Hi.
I'm trying to write JS code in my browser to run a prototype Wikidata bot (written in browser-side JS) that uses XMLHttpRequest() to log in to Wikidata via a bot password in order to make an edit. I'm running into CORS issues.
If I disable CORS in my browser, I can POST with:
?action=login&format=json&lgname=BOTNAME&lgpassword=BOTPASSWORD and I successfully get a login token.
However, if I don't disable CORS, my browser errors with:
'XMLHttpRequest cannot load https://www.wikidata.org/w/api.php. Origin http://localhost:8080 is not allowed by Access-Control-Allow-Origin'
Note that the same request that fails in the browser works fine with ‘curl’.
I’ve tried adding ‘origin=*’. I’ve toggled withCredentials, and I’ve tried a variety of combinations of these.
I’m wondering if it is even possible to have a webpage that can obtain login access (via bot user/pw) and make Wikidata edits. I know that the rest of the Wikimedia sites can use CORS between each other, because they are whitelisted. My site is not on the whitelist, and it shouldn’t be.
An even simpler request that fails is this: https://www.wikidata.org/w/api.php?action=query&format=json&meta=tokens
It works when I disable CORS in the browser, but fails with the same error as above (remember, this is XMLHttpRequest, not just typing into the URL field).
I want to avoid writing server code just to defeat the browser’s CORS protection.
Thanks for any help you can provide,
Dan