[Mediawiki-api-announce] BREAKING CHANGE: Sortkeys in list=categorymembers and prop=categories are now hexadecimal

Roan Kattouw roan.kattouw at gmail.com
Sun Apr 17 13:23:18 UTC 2011


As of r86257 [1], which will be deployed to Wikimedia wikis soon and
will be included in the 1.17 release, sortkeys output by
list=categorymembers and prop=categories are now encoded as
hexadecimal strings, so "FOO" becomes "464f4f".

As previously announced, sortkeys are no longer guaranteed to be
human-readable, and may in fact contain binary data (this will happen
when Wikimedia switches to the UCA/ICU collation). However, outputting
binary data, notably in XML, was problematic [2], so I decided to use
hexadecimal encoding. This means the sortkey as returned by the API is
now guaranteed not to be human-readable, even if the underlying
collation uses a human-readable format (such as the uppercase
collation currently in use on Wikimedia wikis). However, it will still
sort correctly: if A sorts before B in the binary format, that will
also be the case in the hexadecimal format.
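
To illustrate the encoding and the ordering property, here is a small
Python sketch. It is not the MediaWiki implementation; the function name
to_hex_sortkey and the sample keys are made up for this example, and it
assumes the raw sortkey is available as a byte string.

```python
def to_hex_sortkey(raw: bytes) -> str:
    """Encode a raw (possibly binary) sortkey as lowercase hexadecimal."""
    return raw.hex()


# The example from the announcement: "FOO" becomes "464f4f".
print(to_hex_sortkey(b"FOO"))

# Hypothetical raw sortkeys, including binary data and a shared prefix.
raw_keys = [b"FOO", b"BAR", b"FOO\x00BAZ", b"\xff\x01", b""]

# Each byte maps to exactly two hex digits, and the digits 0-9a-f compare
# in the same order as the values they stand for, so sorting the hex
# strings yields the same order as sorting the raw bytes.
by_raw = sorted(raw_keys)
by_hex = sorted(raw_keys, key=to_hex_sortkey)
print(by_raw == by_hex)
```

Because every byte maps to a fixed-width pair of digits, a key that is a
prefix of another key still sorts first after encoding.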

The following things changed:
* The 'sortkey' property in list=categorymembers and prop=categories
is now a hexadecimal string
* In prop=categories, clprop=sortkey will now also output the
'sortkeyprefix' property (human-readable part of the sortkey).
list=categorymembers already provided this through
cmprop=sortkeyprefix
* The format of cmcontinue has changed from type|pageid|rawsortkey to
type|hexsortkey|pageid. If you did not make any assumptions about the
format of cmcontinue and just passed back whatever you got in
query-continue, this won't affect you.
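
For clients that did inspect cmcontinue, the new layout could be taken
apart roughly as in the Python sketch below. This is only an
illustration: the helper parse_cmcontinue, the CmContinue type, and the
sample value are hypothetical, and treating cmcontinue as an opaque
string remains the safer approach.

```python
from typing import NamedTuple


class CmContinue(NamedTuple):
    member_type: str
    sortkey: bytes  # decoded from the hexadecimal form
    pageid: int


def parse_cmcontinue(value: str) -> CmContinue:
    """Split a cmcontinue value in the new type|hexsortkey|pageid layout."""
    # Hex digits never contain "|", so a plain split is enough here.
    member_type, hex_sortkey, pageid = value.split("|")
    return CmContinue(member_type, bytes.fromhex(hex_sortkey), int(pageid))


# Hypothetical continuation value, not taken from a real API response.
parsed = parse_cmcontinue("page|464f4f|12345")
print(parsed.member_type, parsed.sortkey, parsed.pageid)
```

The same bytes.fromhex call recovers the raw sortkey from the 'sortkey'
property; for display purposes 'sortkeyprefix' is the part meant to be
human-readable.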

Roan Kattouw (Catrope)

[1] https://secure.wikimedia.org/wikipedia/mediawiki/wiki/Special:Code/MediaWiki/86257
[2] https://bugzilla.wikimedia.org/show_bug.cgi?id=28541


