Mediawiki-api January 2013

mediawiki-api@lists.wikimedia.org

14 participants
13 discussions

Need to extract abstract of a wikipedia page
by aditya srinivas 23 Nov '23

Hello,

I am writing a Java program to extract the abstract of a Wikipedia page given its title. I have done some research and found out that the abstract will be in rvsection=0. So, for example, if I want the abstract of the 'Eiffel Tower' wiki page, I query the API in the following way:

http://en.wikipedia.org/w/api.php?action=query&prop=revisions&titles=Eiffel…

I then parse the XML data we get back and take the wikitext in the tag <rev xml:space="preserve">, which represents the abstract of the Wikipedia page. But this wikitext also contains the infobox data, which I do not need.

I would like to know if there is any way to remove the infobox data and get only the wikitext related to the page's abstract, or if there is an alternative method by which I can get the abstract of the page directly.

Looking forward to your help.

Thanks in advance,
Aditya Uppu
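One possible way to approach the infobox question, offered as a minimal sketch rather than an established recipe: since an infobox is a `{{...}}` template at the start of section 0, a client could skip any leading templates by matching braces. The function name and the sample wikitext are made up for illustration, and the brace matching is naive (it does not understand `{{{parameter}}}` syntax, comments, or `<nowiki>`), so a real wikitext parser would be more robust.

```python
def strip_leading_templates(wikitext: str) -> str:
    """Drop {{...}} template blocks (e.g. an infobox) that precede the lead text.

    Hypothetical helper: naive brace matching, not a full wikitext parser.
    """
    pos = 0
    n = len(wikitext)
    while True:
        # Skip whitespace before (and between) leading templates.
        while pos < n and wikitext[pos].isspace():
            pos += 1
        if not wikitext.startswith("{{", pos):
            break
        depth = 0
        i = pos
        while i < n:
            if wikitext.startswith("{{", i):
                depth += 1
                i += 2
            elif wikitext.startswith("}}", i):
                depth -= 1
                i += 2
                if depth == 0:
                    break
            else:
                i += 1
        if depth != 0:
            # Unbalanced braces: stop and keep the remaining text as it is.
            break
        pos = i
    return wikitext[pos:]


# Made-up sample resembling the start of section 0.
sample = (
    "{{Infobox building\n"
    "| name = Eiffel Tower\n"
    "| location = {{nowrap|Paris, France}}\n"
    "}}\n"
    "The '''Eiffel Tower''' is a tower in Paris."
)
lead = strip_leading_templates(sample)
```

Nested templates inside the infobox are handled by the depth counter; text that does not start with a template is returned with only its leading whitespace removed.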

4 3

API: to REST or not to REST - the API future
by Yuri Astrakhan 26 Jan '13

After talking to some of you at the office on Friday (was great, thanks!), and reading some new suggestions (thanks Brion), it seems the API has gained several distinct, slightly overlapping areas of operation.

* SQL-like: This is what the API is best at - providing complex structured access to the database - various page properties and lists like "what pages link to X". The prime beneficiaries are bots (spam fighting, link/iw fixing, etc). Required improvements are fairly cosmetic in nature - minor refactoring, warning/error localization, per-module versioning (see RFC <http://www.mediawiki.org/wiki/Requests_for_comment/API_Future>).

* Actions (MasterDB changes): The API has greatly grown in this area, providing access to most MW functionality like editing, uploading, watching, account creation, etc. Some cleanup work might still be required, e.g. better token management, but overall this portion is also fairly mature.

* Content - wiki markup/html/sub-sections: This area seems to need improvements the most, due to the much wider variety and style of uses and the potential server load. For example, mobile requires quick access to rendered individual sections of an article, a web site might embed the page header, and a JavaScript client might want some pieces of information specific to the given site - text from some table, template parameters, and many others. This list *needs to be expanded* for better planning, ideally at the RFC page <http://www.mediawiki.org/wiki/Requests_for_comment/API_Future>.

Unlike query, content access rarely requires multi-source aggregated results, yet has a much higher usage load, making it the primary target for caching. In my opinion, this difference is the root of the misunderstanding between the REST and the current API style advocates.

This brings us to how we want to grow the "content API", assuming that caching is by far the most important concern:

* JSON-only output
* Minimum number of parameters
* URL rewriting

Most improvements for querying and actions are listed at http://www.mediawiki.org/wiki/Requests_for_comment/API_Future and are currently being worked on, but the content approach needs use cases, requests, and interface suggestions.

1 0

Re: [Mediawiki-api] [Wikitech-l] Reporting errors from the API
by Brad Jorsch 25 Jan '13

I'm going to copy this to the mediawiki-api list, too.

On Fri, Jan 25, 2013 at 8:01 AM, Daniel Kinzler <daniel(a)brightbyte.de> wrote:
> Hi all!
>
> Wikidata (technically, Wikibase) uses a lot of JS/API based editing, and we have
> several times hit upon the question of how to best report errors from the API.
> I'll try to break the issue down into several concrete questions. But first off,
> the status quo as I understand it:
>
> * errors are reported using an error code (a string) and a free form error
> message. The message is usually not internationalized, though sometimes it is.
> * warnings are reported as free form text.
> * Additional information can be added to both errors and warnings, but there is
> no standard way to do this.
> * Errors exposed by the API are often not generated but just passed through by
> the API; Typically, a generic error code is used with the original error message
> (e.g. from an exception).

Sounds generally correct. Anything coming out of the API internationalized is probably either being passed through from something else or being generated in an extension.

> So, here are my questions:
>
> * Should error messages returned by the API be translated? Or should the
> translation be left to JavaScript in the client?
> ** In both cases, it would be nice to have a consistent relationship between
> error codes and the corresponding system message.
> ** If translation is done on the client, we need to pass any message parameters
> separately.
> ** The message key would have to somehow be derived or mapped from the error code.

It would be nice to have internationalized and parameterized error messages from the API. The problem is that if we want to do anything without waiting for "API version 2", we want to avoid as much as possible breaking backwards compatibility with existing clients. Which probably means that we'll want to add a parameter for the client to specify the new style errors and warnings; this can double as selecting the language the errors should be returned in. I'd have to look at what existing code does as far as errors/warnings before making a more concrete proposal.

> * When using system messages to translate the error codes from the API, these
> messages will often contain wikitext. How can we best avoid this? Wikitext is
> likely to be quite useless to the client - it would be better to return HTML; or
> pass all the message keys and parameters, and let the client generate the message.

I'd suggest that messages actually returned by the API should be in plain text, and should ''not'' use the MediaWiki namespace. In terms of the Message class, $msg->useDatabase( false )->text(). This makes things sensible for bots and such; they will often be writing errors to a log file or showing them in some user interface where HTML parsing is probably not available.

The message key and parameters should (optionally?) be returned for the client to format as HTML or whatever. And the client is welcome to use the MediaWiki namespace. This is sensible for JavaScript user interfaces and such.

Unfortunately, it seems that the message key prefix "api-error-" is already in use.

> * Status objects are often used to collect errors and warnings that occur while
> trying to perform some task. It would be nice if the API would provide a
> standard way to put the contents of a Status object into the result (well, at
> least the errors and warnings).

It seems it does, at least sort of: ApiResult has a convertStatusToArray() method.

> Any thoughts on that?
>
> -- daniel

3 2

Can't login to MW 1.17 - "WrongToken"
by Levin Magruder 25 Jan '13

I'm trying to use the API in MediaWiki 1.17. I can't get logged in. I am getting a "WrongToken". I think I have the right lgtoken and session cookie, but I've never done any HTTP programming. Can anyone tell from the sequence below what I have wrong? Probably/hopefully it's some simple newbie mistake in not understanding how to handle a cookie or make a session not expire. The user/password combo is able to log in through the normal interactive web interface.

Thanks
Levin

--------------------------------------------------------------------------------
Begin Attempt
--------------------------------------------------------------------------------
POST /mw/api.php HTTP/1.1
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Java/1.7.0_09
Host: wiki.readytheory.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
Content-type: application/x-www-form-urlencoded
Content-Length: 59

format=xml&action=login&lgname=mybot&lgpassword=pwd

--------------------------------------------------------------------------------
First response:
--------------------------------------------------------------------------------
HTTP/1.1 200 OK
Date: Thu, 24 Jan 2013 04:04:01 GMT
Server: Apache/2.2.23 (Amazon)
X-Powered-By: Mono
Set-Cookie: wikidb_session=8foa9hj555b6re8mri4ajd7qi3; path=/; HttpOnly
Cache-Control: private
Content-Length: 162
Connection: close
Content-Type: text/xml; charset=utf-8

<?xml version="1.0"?><api><login result="NeedToken" token="0e8287f8976207131b153ca2acf25cfb" cookieprefix="wikidb" sessionid="8foa9hj555b6re8mri4ajd7qi3" /></api>

--------------------------------------------------------------------------------
SECOND request with cookie
--------------------------------------------------------------------------------
POST /mw/api.php HTTP/1.1
Cookie: wikidb_session=8foa9hj555b6re8mri4ajd7qi3
Cache-Control: no-cache
Pragma: no-cache
User-Agent: Java/1.7.0_09
Host: wiki.readytheory.com
Accept: text/html, image/gif, image/jpeg, *; q=.2, */*; q=.2
Connection: keep-alive
Content-type: application/x-www-form-urlencoded
Content-Length: 99

format=xml&action=login&lgname=mybot&lgpassword=pwd&lgtoken=0e8287f8976207131b153ca2acf25cfb

--------------------------------------------------------------------------------
second response, where I get "WrongToken"
--------------------------------------------------------------------------------
HTTP/1.1 200 OK
Date: Thu, 24 Jan 2013 04:04:01 GMT
Server: Apache/2.2.23 (Amazon)
X-Powered-By: Mono
Cache-Control: private
Content-Length: 61
Connection: close
Content-Type: text/xml; charset=utf-8

<?xml version="1.0"?><api><login result="WrongToken" /></api>
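For readers following the trace, the two-step handshake it shows can be sketched as below. This is only an illustration of how the second request is derived from the first response (token from the `token` attribute, cookie name from `cookieprefix` plus `_session`); it does not diagnose why this particular server answered "WrongToken". The function name is hypothetical, and in practice an HTTP client library would normally manage cookies and Content-Length on its own.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode


def build_login_retry(first_response_xml: str, name: str, password: str):
    """Return (headers, body) for the second action=login POST.

    Hypothetical helper: reads the token, cookie prefix, and session id
    from a NeedToken reply like the one in the trace.
    """
    login = ET.fromstring(first_response_xml).find("login")
    if login is None or login.get("result") != "NeedToken":
        raise ValueError("expected a NeedToken login response")

    # Cookie name as seen in the trace: <cookieprefix>_session.
    cookie = f"{login.get('cookieprefix')}_session={login.get('sessionid')}"
    body = urlencode({
        "format": "xml",
        "action": "login",
        "lgname": name,
        "lgpassword": password,
        "lgtoken": login.get("token"),
    })
    headers = {
        "Cookie": cookie,
        "Content-Type": "application/x-www-form-urlencoded",
        # Content-Length must match the byte length of the body actually sent.
        "Content-Length": str(len(body.encode("utf-8"))),
    }
    return headers, body


# First response body copied from the trace.
first = (
    '<?xml version="1.0"?><api><login result="NeedToken" '
    'token="0e8287f8976207131b153ca2acf25cfb" cookieprefix="wikidb" '
    'sessionid="8foa9hj555b6re8mri4ajd7qi3" /></api>'
)
headers, body = build_login_retry(first, "mybot", "pwd")
```

Using `urlencode` also takes care of percent-encoding any special characters in the user name or password, which a hand-assembled body would not.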

3 4

[Mediawiki-api-announce] API module version information being removed from API help, action=paraminfo
by Brad Jorsch 18 Jan '13

Since it has been broken since the move to git[1] and is unlikely to have been particularly useful anyway, I've just merged a change[2] to remove the version information (and the 'version' parameter) from the API help. Passing the 'version' parameter may now result in a warning about an unrecognized parameter.

Also, the 'version' property in the results from action=paraminfo will now be the empty string. It was not removed completely at this time to avoid breaking clients where accessing an undefined property is a fatal error; however, it should be considered deprecated and may be removed in a future version.

Note that version information for MediaWiki as a whole is still available via meta=siteinfo.[3]

[1]: https://bugzilla.wikimedia.org/show_bug.cgi?id=35885
[2]: https://gerrit.wikimedia.org/r/#/c/43622/
[3]: e.g. https://en.wikipedia.org/w/api.php?action=query&meta=siteinfo

--
Brad Jorsch
Software Engineer
Wikimedia Foundation

_______________________________________________
Mediawiki-api-announce mailing list
Mediawiki-api-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/mediawiki-api-announce

1 0

[Mediawiki-api-announce] Some "badcontinue" errors changing in 1.21wmf8
by Brad Jorsch 15 Jan '13

For badcontinue errors, the API currently returns a code of either "??badcontinue" or "??_badcontinue" (where ?? is the module's prefix), depending on the module. The human-readable error message text may also vary depending on the module.

In 1.21wmf8,[1] these will be standardized as "??badcontinue", with a human-readable message of "Invalid continue param. You should pass the original value returned by the previous query".

Since well-behaved API clients should never see this error message, this change should not require any client updates.

[1]: Gerrit change https://gerrit.wikimedia.org/r/#/c/42398

--
Brad Jorsch
Software Engineer
Wikimedia Foundation
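A client that does want to recognize this error across the change could match on the common suffix instead of one exact spelling. The sketch below is a hypothetical helper, and the "cm" prefix in the checks is only illustrative:

```python
def is_badcontinue(code: str) -> bool:
    """True for both the old "??_badcontinue" and the standardized "??badcontinue" forms."""
    return code.endswith("badcontinue")


# Illustrative codes with a made-up "cm" module prefix.
old_style = is_badcontinue("cm_badcontinue")
new_style = is_badcontinue("cmbadcontinue")
unrelated = is_badcontinue("badtoken")
```

Because both spellings end in "badcontinue", the same check works before and after the 1.21wmf8 deployment.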

1 0

Internal error in ApiResult::setElement
by Eric Sun 11 Jan '13

Hi,

Something changed in the last couple of days that causes a bunch of API calls to fail when grabbing image dimensions. This error is consistently generated from the URL below:

{"servedby":"mw72","error":{"code":"internal_api_error_MWException","info":"Exception Caught: Internal error in ApiResult::setElement: Attempting to add element imagerepository=shared, existing value is shared","*":""}}

http://en.wikipedia.org/w/api.php?titles=File:Convento_Cristo_Decemebr_2008…

Requesting the data separately, i.e.

http://en.wikipedia.org/w/api.php?titles=File:Convento_Cristo_Decemebr_2008…

and

http://en.wikipedia.org/w/api.php?titles=File:Convento_Cristo_Decemebr_2008…

gives the correct result. Other multi-queries like

http://en.wikipedia.org/w/api.php?titles=File:Convento_Cristo_Decemebr_2008…

also consistently work fine. What's going on here?

Thanks,
Eric
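Since the failure arrives as an ordinary JSON body with a top-level "error" object, a client can surface it explicitly instead of failing later on missing image data. A minimal sketch, assuming format=json responses shaped like the one quoted above; the `ApiError` class and `parse_api_response` function are hypothetical names, not part of any library:

```python
import json


class ApiError(Exception):
    """Raised when an API response carries a top-level "error" object."""

    def __init__(self, code, info, served_by=None):
        super().__init__(f"{code}: {info}")
        self.code = code
        self.info = info
        self.served_by = served_by


def parse_api_response(text: str) -> dict:
    """Decode a JSON response and raise ApiError if it reports an error."""
    data = json.loads(text)
    error = data.get("error")
    if error is not None:
        raise ApiError(error.get("code"), error.get("info"), data.get("servedby"))
    return data


# Response body copied from the report above.
raw = (
    '{"servedby":"mw72","error":{"code":"internal_api_error_MWException",'
    '"info":"Exception Caught: Internal error in ApiResult::setElement: '
    'Attempting to add element imagerepository=shared, existing value is shared",'
    '"*":""}}'
)
try:
    parse_api_response(raw)
    caught = None
except ApiError as exc:
    caught = exc
```

Keeping the "servedby" value on the exception makes it easier to include the serving host when reporting a problem like this one.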

2 1

APIv2 alpha implementation
by Yuri Astrakhan 09 Jan '13

It's almost New Year, and time for presents (according to the Russian traditions). I hereby present you the API versioning framework. Navigate to https://gerrit.wikimedia.org/r/#/c/41014/ and get that patch to your local installation for a test run.

There shouldn't be any visible functionality changes. At least that was the goal. But now we can easily add a new version of a module or a submodule (prop/list/etc) without breaking existing code.

To see it in action, in ApiMain.php, add this line to the $Modules array:

    'query2' => 'ApiQuery',

It adds another version of query implemented by the same existing class. Now if you look at http://localhost/api.php you will see:

    * action=query *
    This module is obsolete. See old documentation at api.php?action=help&modules=query

    * action=query2 *
    ... -- the regular query parameters -- ...

Both query and query2 will work identically in this example unless you change ApiQuery.php.

Let the rotten egg throwing commence! Once we settle on this, I propose you join me at the mediawiki labs -- https://labsconsole.wikimedia.org/wiki/Nova_Resource:Mediawiki-api to develop and test the actual changes to the APIv2 interface.

To keep it going full steam ahead, I will accept ideas, criticisms, cookies, code patches, small unmarked bills and large paychecks, but most importantly - your time analyzing and commenting on this effort.

--Yurik

4 12

binary param
by webmaster＠numerica.cl 08 Jan '13

Hello again. I've got another question about using the API.

One of the params my API module receives is the binary representation of a hexadecimal value (the info hash), and this parameter arrives spoiled in the $params array. Digging through the debugger, I found that the culprit is the getGPCVal method in WebRequest.php, which passes the string through

$data = $wgContLang->checkTitleEncoding( $data ); [l.340]

and

$data = $this->normalizeUnicode( $data ); [l.343]

so it is altered twice.

I don't know much about Unicode normalization, but I do understand that the problem here is that it shouldn't be applied to binary data. The integer type won't do, though, and there is not much more choice... or maybe API types can be extended?

I'm curious about checkTitleEncoding, on the other hand. What is it meant for? Could I be lacking the mb_check_encoding function, and is that why it falls back to 8-bit encoding?

I can always use good old $_GET, but I feel I'd rather not. Is it really problematic?

Thanks in advance for any ideas.

Numerico
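One workaround that is not proposed in the message itself, sketched here under the assumption that the poster controls both the client and the API module: send the info hash as a hexadecimal string instead of raw bytes. A hex string is plain ASCII, and ASCII text is left unchanged by Unicode normalization, so it should pass through encoding checks intact. The function names are hypothetical, and the example hash bytes are made up.

```python
import unicodedata


def encode_info_hash(raw: bytes) -> str:
    """Client side: turn the binary hash into an ASCII hex string."""
    return raw.hex()


def decode_info_hash(param: str) -> bytes:
    """Receiving side: recover the original bytes from the hex parameter."""
    return bytes.fromhex(param)


# Normalization can rewrite non-ASCII text: "e" + combining acute becomes one code point.
changed_by_nfc = unicodedata.normalize("NFC", "e\u0301") != "e\u0301"

# A hex-encoded value survives normalization and round-trips exactly.
raw_hash = bytes(range(20))  # made-up 20-byte value standing in for an info hash
wire_value = encode_info_hash(raw_hash)
after_normalization = unicodedata.normalize("NFC", wire_value)
round_tripped = decode_info_hash(after_normalization)
```

The cost is that the parameter doubles in length, which is usually acceptable for a 20-byte hash.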

2 2

BREAKING CHANGE: Format of action=query&list=logevents&leprop=details changing
by Brad Jorsch 07 Jan '13

Due to bug 43221 and Gerrit change I1a3c7ac6, the format of the "details" for some log events will be changing. In particular, any event with details that were returned with a key resembling "4::foo" will now be returned under a key of simply "foo".

This change will be deployed with 1.21wmf7, which is scheduled for today on mediawiki.org, test.wikipedia.org, and wikidata; 7 January for non-Wikipedia sites; 9 January for enwiki; and 14 January for the other Wikipedias.

--
Brad Jorsch
Software Engineer
Wikimedia Foundation
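During the staggered rollout a client may see both key styles, depending on the wiki. A minimal sketch of tolerating both, assuming the old keys always look like digits followed by "::" as in the "4::foo" example; the helper name and the sample dictionary are hypothetical:

```python
import re

# Matches the numeric prefix of old-style keys such as "4::foo".
_NUMBERED_KEY = re.compile(r"^\d+::")


def normalize_details(details: dict) -> dict:
    """Return a copy of a log event's details with "N::" key prefixes removed."""
    return {_NUMBERED_KEY.sub("", key): value for key, value in details.items()}


# Made-up details in the old and new formats.
old_format = normalize_details({"4::foo": "bar", "other": 1})
new_format = normalize_details({"foo": "bar", "other": 1})
```

Both calls yield the same dictionary, so the rest of the client only has to handle the new key names.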

3 3

