Hoi,
If you are interested in collation, you may want to look into the CLDR (Common Locale Data Repository); it is where the collations are registered per language. There is no such thing as a universally correct sorting algorithm. NB: the CLDR is a Unicode project.
Thanks,
GerardM
2009/3/11 Aryeh Gregor <Simetrical+wikilist@gmail.com>
On Wed, Mar 11, 2009 at 6:14 AM, Daniel Kinzler daniel@brightbyte.de wrote:
There is none. Sorting is done by the database. That is to say, in the default "compatibility" mode, binary "collation" is used - that is, byte-by-byte comparison of UTF-8 encoded data. Which sucks. But we are stuck with it until MySQL gets proper Unicode support.
And until we upgrade to that version. MySQL 4 doesn't have *any* Unicode support -- or any character encoding support, in fact. Everything is binary.
But we don't have to wait on MySQL. We would just have to store a Unicode sortkey in cl_sortkey instead of the actual Unicode characters. This would require an implementation of a Unicode sorting algorithm in MediaWiki. It could be language-specific or whatever you want.
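To make the sortkey idea concrete, here is a minimal sketch in Python (not MediaWiki code, and not the real Unicode Collation Algorithm or CLDR tailoring). The helper name simple_sortkey is hypothetical. It only assumes that the database keeps comparing byte by byte, and it builds a byte string whose binary order ignores case and accents first, then falls back to the original title as a tie-breaker. A value like this is the kind of thing that could be stored in cl_sortkey.

```python
import unicodedata


def simple_sortkey(title: str) -> bytes:
    """Hypothetical helper: build a key whose byte order approximates
    a case- and accent-insensitive alphabetical order.

    This is an illustration only. A real implementation would follow the
    Unicode Collation Algorithm with per-language CLDR rules.
    """
    # Decompose characters so accents become separate combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    # Drop the combining marks, keeping only the base characters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Fold case so "Zebra" and "zebra" get the same primary key.
    primary = base.casefold().encode("utf-8")
    # Append the original bytes after a 0x00 separator so that titles
    # with equal primary keys still get distinct, stable keys.
    return primary + b"\x00" + title.encode("utf-8")


titles = [
    "Zebra",
    "\u00c9clair",  # Éclair
    "apple",
    "eclair",
    "\u00c4hre",    # Ähre
]

# What a binary "collation" does today: compare the raw UTF-8 bytes.
binary_order = sorted(titles, key=lambda t: t.encode("utf-8"))

# What byte comparison gives when it is applied to the sortkey instead.
sortkey_order = sorted(titles, key=simple_sortkey)

print(binary_order)   # ['Zebra', 'apple', 'eclair', 'Ähre', 'Éclair']
print(sortkey_order)  # ['Ähre', 'apple', 'eclair', 'Éclair', 'Zebra']
```

The database side does not change at all in this sketch: it still sorts bytes, but the bytes it sorts were chosen by the application. A language-specific version would swap the body of simple_sortkey for per-language rules.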
Wikitech-l mailing list Wikitech-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikitech-l