Hi all!
It seems that during the current efforts to modernize the API, the default language (as represented by $wgLang) was changed to always be the content language [1][2] instead of the user language. This may not be a problem for most of the core modules, but it broke some essential Wikibase API modules (which relied on the actual user language being the default). We propose to change the default behavior back to what it was [3].
I think however that it would be good to first discuss how exactly we WANT the API to behave.
First off, I'd like to point out that there are two distinct use cases for API calls, or rather, two contexts in which API output is shown to a user:
1) Bots. Error messages are shown on the command line. The bot is potentially interacting with multiple wikis at once.
2) UI. The web interface uses the API to implement user interactions. Feedback/error messages are shown to the user directly. In some cases (notably for Wikibase), the language is also relevant for parsing user input and formatting output (think dates, numbers).
The "API help" use case is a bit in between the two, but since it's meant for bot authors, I'd count it under (1); Help shown on Special:ApiSandbox, however, would count under (2).
Now, let's look at the three options we have for the default language in the API:
a) The user language according to user preferences. This used to be the case (technically), it's what Wikibase relies on, and it's what [3] reverts to.
b) Hardcoded to "en". This was the de-facto standard for error/help messages, since API messages are/were hard-coded in English.
c) The content language. This is what [1] did.
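To make the three options concrete, here is a rough sketch in Python. It is purely illustrative: the Request class, its fields and default_language() are made-up names, not MediaWiki code, and it treats an explicit uselang as a plain language code for simplicity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Request:
    # All fields are hypothetical stand-ins for what the API knows about a call.
    uselang: Optional[str]   # explicit uselang parameter, if the client sent one
    user_language: str       # language from the user's preferences
    content_language: str    # the wiki's content language


def default_language(req: Request, policy: str) -> str:
    """Pick the output language under option 'a', 'b' or 'c'."""
    if req.uselang is not None:
        # Assumption: an explicit parameter overrides any default.
        return req.uselang
    if policy == "a":
        return req.user_language
    if policy == "b":
        return "en"
    if policy == "c":
        return req.content_language
    raise ValueError(f"unknown policy: {policy!r}")
```

The only difference between the options is what happens when the client says nothing, which is exactly the case existing clients are in.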
Option (a) used to be the status quo, and seems intuitive: if I set a default language, I want to use that default language everywhere. API is an integral part of the UI and should behave as such, so it's right for (2).
There is something to be said for option (b) for use case (1): if I run a bot on 100 wikis, I don't want to go and set the user language on all the wikis. Getting error messages in English seems a good compromise (assuming people who can program need to understand at least basic English anyway), and it's what people are used to.
I personally see no advantage in option (c): For use case (1), it means I have to put the uselang parameter in ALL API requests, otherwise I'll see errors in the local language of each of the 100 wikis. For use case (2), it means I get output in the content language instead of my user language - this is just inconsistent, and particularly annoying on multilingual wikis like Commons.
The current behavior, option (c), was introduced in the context of localizing the API self-documentation page. If you consider that to be "content", it makes a certain amount of sense, but it breaks a ton of other use cases. Was this even intentional?
I recommend returning to option (a). We can still cover use case (1) by keeping the English message in its current place in the response, and adding the localized message under a different key (and perhaps making it optional). But keep in mind that errors are not the only locale-sensitive output of the API - some API modules generate HTML for display to the user.
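As a sketch of what "English in the current place, localized under a different key" could look like (Python, illustrative only): the "code" and "info" keys follow the existing error format as I understand it, while the "localized" key, build_error() and the translate callback are made up for this example.

```python
from typing import Callable, Optional


def build_error(
    code: str,
    english: str,
    translate: Optional[Callable[[str], str]] = None,
) -> dict:
    """Build an error payload that stays readable for existing bots."""
    # The English text stays where clients already look for it.
    error = {"code": code, "info": english}
    if translate is not None:
        # Hypothetical extra key; only present when a localization was requested.
        error["localized"] = translate(code)
    return {"error": error}
```

A bot that only reads the English text keeps working unchanged, and a UI client can pick up the extra key when it is present.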
What do you think?
[1] https://gerrit.wikimedia.org/r/#/c/160798/
[2] The goal of the change in question was apparently to support the uselang parameter for API modules, which is a good thing.
[3] https://gerrit.wikimedia.org/r/#/c/170895/
The disadvantage of using the user language is that it reduces cacheability: all responses have to be marked anon-public-user-private instead of public where public would otherwise be allowed. True, there are a number of other reasons that an API response may not be cached, but adding one more to the pile doesn't improve the situation.
When a client really does need output in the user language, it seems simple enough for that client to specify "&uselang=user". I note various JavaScript in mediawiki/core has already been specifying uselang with the value from mw.config.get( 'wgUserLanguage' ) even before uselang was officially recognized by the API.
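For a client, that could be as small as the following Python sketch. The api_url() helper and the example.org endpoint are placeholders for illustration; only the uselang parameter and its "user" value come from the discussion above.

```python
from urllib.parse import urlencode


def api_url(base: str, params: dict, uselang: str = "user") -> str:
    """Build an API request URL that asks for output in the user language."""
    query = dict(params)
    # Only fill in uselang if the caller did not already choose a language.
    query.setdefault("uselang", uselang)
    query.setdefault("format", "json")
    return f"{base}?{urlencode(query)}"


# Placeholder endpoint; substitute the wiki's real api.php URL.
url = api_url(
    "https://example.org/w/api.php",
    {"action": "query", "meta": "siteinfo"},
)
```

Putting the default in one helper means each client only needs to be touched in a single place.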
I note that, while adding uselang as an official parameter was done along with the help change, it was only done then because uselang was on my list anyway and without it I would have had to add uselang to both action=help and action=paraminfo (only to remove it later).
It's simple enough for clients. It's still a breaking change, though, and every client has to be touched. For example, right now I'm trying to work with ApiSandbox, and it does not support or use uselang yet. Sure, it's easy enough to fix, but there are lots of clients out there, and most of them are not under our control.
I get the caching argument, though. I wonder if we couldn't just mark requests without a uselang parameter anon-public-user-private. That way we would gradually get the caching benefits without breaking anything.
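Roughly what I have in mind, as a Python sketch. The cache_mode() function is hypothetical and not how the API code is actually structured; treating uselang=user like a missing parameter is my own assumption, since that output also depends on the user.

```python
from typing import Optional


def cache_mode(uselang: Optional[str], module_mode: str = "public") -> str:
    """Decide the cache mode for a response, given the request's uselang."""
    if module_mode != "public":
        # The module already restricts caching for some other reason.
        return module_mode
    if uselang is None or uselang == "user":
        # Output falls back to the user language, so it varies per user.
        return "anon-public-user-private"
    # An explicit language code makes the output independent of the user.
    return "public"
```

Old clients keep getting the user language with the stricter cache mode, and clients that start sending an explicit language get public caching.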
On 04.11.2014 at 16:26, Brad Jorsch (Anomie) wrote:
> The disadvantage of using the user language is that it reduces cacheability: all responses have to be marked anon-public-user-private instead of public where public would otherwise be allowed. True, there are a number of other reasons that an API response may not be cached, but adding one more to the pile doesn't improve the situation.
My complaint related to the user language, which is only relevant for logged-in users - which means web caching doesn't apply anyway. I don't see how caching is relevant here.
> When a client really does need output in the user language, it seems simple enough for that client to specify "&uselang=user".
Yes, but this is still a breaking change that should have been announced and discussed first.
By the way - once we support localized error messages (soon, I hope), does this change mean that bots will always see localized error messages, unless they add the uselang parameter to all requests?
> I note that, while adding uselang as an official parameter was done along with the help change, it was only done then because uselang was on my list anyway and without it I would have had to add uselang to both action=help and action=paraminfo (only to remove it later).
I'm all for supporting uselang. I'm just against changing the default.
The new behavior is inconsistent with the behavior of the main entry point and UI, and breaks existing clients. The caching seems rather hypothetical to me (it seems more realistic to go for a proper REST API for that). Please restore the old behavior.
-- daniel
On Tue, Nov 4, 2014 at 11:25 AM, Daniel Kinzler daniel@brightbyte.de wrote:
> By the way - once we support localized error messages (soon, I hope), does this change mean that bots will always see localized error messages, unless they add the uselang parameter to all requests?
No. The plan there is that errors and warnings will be output in English (and in the old format for warnings) unless specifically requested otherwise.
Considering that mobile relies on the API for a large percentage of its UI and currently serves about half of our traffic, I'm pretty surprised this change was not discussed with the mobile team. I'm on a bus right now and haven't been able to test things thoroughly, but it looks like some parts of the mobile interface are no longer properly localized for logged-in users, e.g. notifications. I'll look into this further when I get to the office.
Ryan Kaldari