Both of these changes should be deployed to WMF wikis with 1.28.0-wmf.17; see https://www.mediawiki.org/wiki/MediaWiki_1.28/Roadmap for the schedule.

== Alternative multi-value separator ==

With the merging of Gerrit change 305126,[1] if a value for a multi-valued parameter must contain pipe characters (U+007C, "|"), it will now be possible to use the Unit Separator character (U+001F) instead. As this character is not otherwise valid input for any strings in MediaWiki, its use here should not conflict with any valid input. To signal that you're using this feature, the whole multi-value parameter must also be prefixed with the Unit Separator character.

For example, you can now use meta=allmessages to parse the "search-rewritten" message with the text "foo|bar" as $1 and "foo|baz" as $2:
https://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&meta=allmessages&ammessages=search-rewritten&amenableparser=1&amargs=%1Ffoo|bar%1Ffoo|baz

Client libraries should consider updating to use this feature when asked to send a multi-valued parameter with one or more values containing pipe characters.
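
As a rough illustration of what such a client-side update might look like, here is a minimal Python sketch. The helper name join_multivalue is hypothetical and not part of any existing library; it falls back to the Unit Separator form only when a value contains a pipe, and it rejects values that already contain U+001F.

```python
from urllib.parse import quote

UNIT_SEPARATOR = "\x1f"  # U+001F


def join_multivalue(values):
    """Join values for a multi-valued API parameter (hypothetical helper).

    Uses "|" when no value contains a pipe; otherwise prefixes the whole
    parameter with U+001F and uses U+001F as the separator.
    """
    values = list(values)
    if any(UNIT_SEPARATOR in v for v in values):
        raise ValueError("values must not contain U+001F")
    if any("|" in v for v in values):
        return UNIT_SEPARATOR + UNIT_SEPARATOR.join(values)
    return "|".join(values)


# The amargs value from the example above, before and after percent-encoding.
amargs = join_multivalue(["foo|bar", "foo|baz"])
encoded = quote(amargs, safe="|")
```

Here encoded matches the amargs value in the example URL above. Values without pipes are still joined with "|", so requests that never needed the new separator are unchanged.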


== Unicode normalization warnings ==

The API has always expected input as NFC-normalized Unicode represented as UTF-8. Non-NFC-normalized UTF-8 input would be silently normalized, and in the query string non-UTF-8 input would be interpreted as being in a fallback encoding (such as Windows-1252) and converted to Unicode. This sometimes led to subtle bugs when input was unexpectedly converted.

With the merging of Gerrit change 306491,[2] the API will now issue a warning whenever input was subject to such conversion, which should make it more obvious to clients when this has happened.
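
Clients that want to avoid the warning can normalize their input before sending it. The following is a minimal Python sketch using the standard unicodedata module; the helper name ensure_nfc is hypothetical, and reporting whether the text changed is just one possible design.

```python
import unicodedata


def ensure_nfc(text):
    """Return (nfc_text, changed) for a string about to be sent to the API.

    Hypothetical helper: 'changed' is True when the input was not already
    NFC-normalized, i.e. when the API would otherwise have converted it.
    """
    normalized = unicodedata.normalize("NFC", text)
    return normalized, normalized != text


# "a" followed by U+030A COMBINING RING ABOVE composes to U+00E5 under NFC.
title, changed = ensure_nfc("a\u030a")
```

The normalized string should then be encoded as UTF-8 when building the request, so that the fallback-encoding conversion is not triggered either.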

When this happens in the 'titles' parameter for ApiPageSet-using modules, the response will also include the conversion of each title in the existing 'normalized' element. Since the API result cannot directly represent non-normalized data, these entries will have the 'from' element percent-encoded and a 'fromencoded' boolean will be included alongside to indicate this. This normalization step is separate from the existing Title normalization (uppercasing the first letter and replacing underscores with spaces), so two entries may be generated in the 'normalized' element.

See https://en.wikipedia.beta.wmflabs.org/w/api.php?action=query&titles=a%CC%8A&formatversion=2 for an example showing the new warning and normalization entries.
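
For client authors, here is a hedged Python sketch of how the 'normalized' entries described above might be consumed. The sample data is hand-written to match that description rather than copied from a real response, it assumes formatversion=2 (where 'fromencoded' is a real boolean), and the helper names decode_normalized and resolve_title are hypothetical.

```python
from urllib.parse import unquote


def decode_normalized(entries):
    """Build a from -> to mapping, undoing percent-encoding where flagged."""
    mapping = {}
    for entry in entries:
        source = entry["from"]
        if entry.get("fromencoded"):
            # Percent-encoded UTF-8 bytes of the original, non-normalized input.
            source = unquote(source)
        mapping[source] = entry["to"]
    return mapping


def resolve_title(title, mapping):
    """Follow chained entries (Unicode step, then Title step) to the end."""
    seen = set()
    while title in mapping and title not in seen:
        seen.add(title)
        title = mapping[title]
    return title


# Illustrative entries only: one for Unicode normalization, one for Title
# normalization (uppercasing the first letter).
sample = [
    {"fromencoded": True, "from": "a%CC%8A", "to": "\u00e5"},
    {"fromencoded": False, "from": "\u00e5", "to": "\u00c5"},
]
mapping = decode_normalized(sample)
final = resolve_title("a\u030a", mapping)
```

Because the two normalization steps are reported separately, a client that maps its original input to the returned page needs to follow both entries, which is what resolve_title does.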


 [1]: https://gerrit.wikimedia.org/r/#/c/305126/
 [2]: https://gerrit.wikimedia.org/r/#/c/306491/


--
Brad Jorsch (Anomie)
Senior Software Engineer
Wikimedia Foundation