I have read a lot of technical detail in this thread about how extremely
complex collation and sorting are.
I admit that I don't understand all of the database dependencies,
collation subtleties, or whatever other limiting factors play a role
here. Perhaps I shouldn't participate in a tech discussion that I don't
fully understand, but take me for a wiki user who spends many hours
adding defaultsort statements to articles and doesn't understand why the
software cannot do it by itself. Perhaps you can shed some more light on
it for a dummy like me.
Here is how, in my simple mind, I think it could be solved (I'm sure my
thoughts are too simple, but I want to understand why and in what way
they are too simple). As an example I take the German language:
Take the page name and convert it to uppercase (lowercase would work
too, but uppercase seems better because the first letter will show up in
the category). Then str_replace "Ä" with "A", "Ö" with "O", "Ü" with "U"
and "ß" with "SS", and likewise replace every other Latin character that
carries a diacritic with its counterpart without the diacritic. The
result is our sortkey. This very simple procedure should reduce the
number of necessary defaultsorts (except for articles about persons) by
about 90% in the German Wikipedia. Implement these steps directly in the
software and it should fix the sorting of categories. I have read a lot
about uniqueness in this thread, but defaultsort isn't unique either.
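
To make the steps above concrete, here is a rough sketch. It is written
in Python only for illustration, not in the PHP that MediaWiki uses, and
it is not taken from the MediaWiki code. The names simple_sortkey and
GERMAN_REPLACEMENTS are made up for this example, and using Unicode NFD
decomposition to strip the remaining diacritics is my assumption about
how "replace other Latin characters with diacritics" could be done.

```python
import unicodedata

# Hypothetical replacement table for the German-specific letters named
# in the post. It is applied after uppercasing. The capital sharp s is
# included because uppercasing leaves it unchanged.
GERMAN_REPLACEMENTS = {"Ä": "A", "Ö": "O", "Ü": "U", "ß": "SS", "ẞ": "SS"}


def simple_sortkey(pagename: str) -> str:
    """Build a naive sortkey: uppercase, then strip diacritics."""
    # Step 1: uppercase the page name.
    key = pagename.upper()

    # Step 2: apply the explicit German replacements.
    for source, target in GERMAN_REPLACEMENTS.items():
        key = key.replace(source, target)

    # Step 3 (assumption): decompose the remaining characters and drop
    # the combining marks, so that "É" becomes "E". Letters that have no
    # decomposition, such as "Ø" or "Ł", are left unchanged.
    decomposed = unicodedata.normalize("NFD", key)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


if __name__ == "__main__":
    titles = ["Zebra", "Ärzte", "Apfel", "Öl", "Ofen"]
    print(sorted(titles, key=simple_sortkey))
```

Sorting by the plain titles would put "Ärzte" and "Öl" after "Zebra",
because their first letters have higher code points than "Z". Sorting by
this key places them next to "Apfel" and "Ofen" instead.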
Of course it only works for languages where the Unicode byte order of
the basic script corresponds to the sorting order. But a solution
helping 80% of the languages in 80% of all cases (and with no
disadvantages for the other 20%) is better than a solution that helps
100% of all languages in 100% of all cases but does not exist yet,
isn't it?
Marcus Buck
User:Slomox