"Tomasz" == Tomasz Wegrzanowski <taw(a)users.sourceforge.net> writes:
On Mon, Feb 21, 2005 at 09:22:17AM -0800, Ray
> Petr Kadlec wrote:
> >Note that there is a bugreport about that in Bugzilla:
> >You can at least throw in a vote. :-)
> I can sympathize with the idea, but one has to keep in mind that the
> sort order should vary between one language and another. In English we
> would alphabetize "æ" as though it were "ae" while Danish treats it as a
> separate letter tagged on at the end of the alphabet.
We can do a lot better than Unicode binary sort order.
The sort order should not vary between Latin-script languages,
because it should be script-dependent, not language-dependent (sorted
words don't have to come from the same language, but merely have to
be in the same script - think proper names). It's very unfortunate
that there are different traditions for sorting Latin writing
In most languages the right place for base letter X with diacritical
mark Y is somewhere after plain letter X (and we can choose an order of Ys that generates
Given that Danish has come up in the discussion, I will hasten to
point out that the Danish letter "å" is interchangeable with "aa". In
/some/ cases. For instance, the German city Aachen sorts at the
start of any list, whereas the Danish city Aalborg (Ålborg) comes at
the end of any collation.
In short, nothing is as simple as it seems.
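To make the Aachen/Aalborg wrinkle concrete, here is a minimal sketch of a Danish-style sort key. It assumes the Danish alphabet order (..., z, æ, ø, å), treats "aa" as "å", and relies on a hand-maintained exception set for foreign names. The names `danish_key` and `FOREIGN_AA` are hypothetical and only serve the illustration; this is not how any particular library does it.

```python
import unicodedata

# Danish alphabet: the three extra letters come after z, with å last.
DANISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzæøå"
RANK = {ch: i for i, ch in enumerate(DANISH_ALPHABET)}

# Hypothetical exception list: names whose "aa" is NOT a spelling of "å".
FOREIGN_AA = {"aachen"}

def danish_key(word):
    """Sort key that treats 'aa' as 'å' unless the word is a listed exception."""
    folded = unicodedata.normalize("NFC", word).casefold()
    if folded not in FOREIGN_AA:
        folded = folded.replace("aa", "å")
    # Characters outside the alphabet sort after everything else (an assumption).
    return [RANK.get(ch, len(DANISH_ALPHABET)) for ch in folded]

cities = ["Aalborg", "Odense", "Aachen", "Ålborg", "Zwolle", "Aarhus"]
print(sorted(cities, key=danish_key))
# ['Aachen', 'Odense', 'Zwolle', 'Aalborg', 'Ålborg', 'Aarhus']
```

The exception set is the weak point: no rule distinguishes Aachen from Aalborg, so somebody has to maintain the list by hand.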
In some it sorts the same way as X, in which case sorting it after X
is still much better than at the end of the alphabet.
The languages with such letters at the end, or with
orders are few, and we won't break any more than we currently do if
we adopt "base, then base+diacritics" sorting.
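A rough sketch of what "base, then base+diacritics" sorting could look like, under one possible reading of the proposal: words are compared on their base letters first, and diacritics only break ties. The function name `base_then_diacritics_key` is made up for this example. Note that Unicode decomposition does not split letters such as "æ" or "ø", so those would need an explicit mapping that this sketch does not include.

```python
import unicodedata

def base_then_diacritics_key(word):
    """Sort key: compare base letters first, diacritics only as a tie-breaker."""
    # NFD splits e.g. "é" into "e" followed by a combining acute accent.
    decomposed = unicodedata.normalize("NFD", word.casefold())
    # Drop the combining marks to get the plain base letters.
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Equal bases fall back to the decomposed form, so plain "e" precedes "é".
    return (base, decomposed)

words = ["zebra", "Åland", "apple", "éclair", "eclair", "Zürich", "Zurich"]
print(sorted(words, key=base_then_diacritics_key))
# ['Åland', 'apple', 'eclair', 'éclair', 'zebra', 'Zurich', 'Zürich']
```

With a plain code point sort, "Åland" and "éclair" would land after "zebra"; here they stay next to their base letters.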
Probably not, but the Danes are still going to complain about those
two examples, no matter what's done.
mailto:email@example.com - Invitations on an FCFS basis