Re: [Wikitech-l] More aggressive DEFAULTSORT

13 May 2009

      Hoi,
The introduction demonstrates that Unicode does indeed deal with collation.
When you look at the characters in Unicode, you will find that the Unicode
standard is very much a work in progress. When you look at the CLDR, you
will find that it is also very much a work in progress. However, for many
languages the collation has been well defined and is unlikely to change.
For African languages, there is a project called Afrigen that is collecting
the information needed to include them in the CLDR.
I am not impressed by your argument that you will have to rebuild the
sorting order whenever a collation order changes. First of all, standards
like the CLDR have releases, so these iterations only happen when a new
release becomes available. Second, it seems odd to me to refuse to implement
an improved collation order when the current one is wrong in the first
place.
I have always been told that we develop and implement open source in order
to create open content using open standards. In my opinion you have not
provided any argument for why any other approach is preferable. In this case
the CLDR is an applicable open standard.
When, as a consequence of an improved collation order for particular
languages, we have to rebuild databases every now and again, that is tough,
but it needs to be done. It is all part of normal and acceptable system
management.
Thanks,
      GerardM
http://o2.it46.se/afrigen/statistics.php
2009/5/13 Domas Mituzas midom.lists@gmail.com
...
Hi!
...
http://www.unicode.org/reports/tr10/#Introduction
This is not CLDR; this is the general collation algorithm.
...
http://cldr.unicode.org/index/cldr-spec/collation-guidelines
CLDR is a repository/process for LDMLs (that's what I was referring to with
people sending us that data, in case the current data is wrong or missing).
Currently it has mistakes and multiple versions even for the same locales;
it doesn't seem to be too stable or correct.

An example:
http://unicode.org/cldr/data/common/collation/lt.xml?rev=1.26&content-ty...
 ;-)
Do note that such unstable changes require database rebuilds at each
iteration. So we'd have to have someone reviewing it all, comparing it
with different sources, and then pushing it once every few years into
some data staging environment where we do data conversions all the
time? :) riiight...
Domas
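
To make the rebuild concern concrete, here is a minimal sketch of why a
change in collation rules invalidates stored sort keys. It assumes sort keys
are computed once and stored alongside each row. The two alphabets are
hypothetical "releases" invented for illustration; they are not taken from
CLDR, and make_sort_key is a toy helper, not the Unicode Collation Algorithm.

```python
def make_sort_key(alphabet):
    """Return a function mapping a title to a sort key under a letter order."""
    rank = {ch: i for i, ch in enumerate(alphabet)}
    unknown = len(alphabet)

    def sort_key(title):
        # Characters outside the alphabet sort last, ordered by code point.
        return tuple((rank.get(ch, unknown), ord(ch)) for ch in title.lower())

    return sort_key


# Hypothetical collation releases: v2 moves "y" between "i" and "j".
ALPHABET_V1 = "abcdefghijklmnopqrstuvwxyz"
ALPHABET_V2 = "abcdefghiyjklmnopqrstuvwxz"

titles = ["jonas", "ylakiai", "ignalina", "zarasai"]

# Simulated index: keys are computed once under v1 and stored per title.
key_v1 = make_sort_key(ALPHABET_V1)
stored_index = {t: key_v1(t) for t in titles}
order_v1 = sorted(titles, key=stored_index.get)

# After the rules change, any stored key that differs from its v2 key is stale.
key_v2 = make_sort_key(ALPHABET_V2)
stale = [t for t in titles if stored_index[t] != key_v2(t)]

# Rebuilding means recomputing every stale key and re-sorting the index.
rebuilt_index = {t: key_v2(t) for t in titles}
order_v2 = sorted(titles, key=rebuilt_index.get)

print(order_v1)
print(order_v2)
print(len(stale), "of", len(titles), "stored keys are stale")
```

In this toy setup, moving a single letter shifts the rank of every letter
after it, so keys go stale even for titles that do not contain the moved
letter. How far this carries over to a real database depends on how the sort
keys are actually derived and stored.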

Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
