Hi!
I have always been told that we develop and implement open source in order to create open content using open standards. In my opinion you have not provided any argument why any other approach is preferable. In this case the CLDR is an applicable open standard.
I wonder why you call it 'a standard': the markup is a standard, the data is not. This is what Wikipedia says:
"The Common Locale Data Repository Project, often abbreviated as CLDR, is a project of the Unicode Consortium to provide locale data in the XML format for use in computer applications. CLDR contains locale specific information that an operating system will typically provide to applications. "
I'd prefer using binary sort: then we don't have to change anything, and everything is done extremely efficiently :-) Anyway, there are lots and lots of implementation details and problems.
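To make the trade-off concrete, here is a minimal sketch in Python of what binary sort does compared with a crude language-aware ordering. The function names (binary_sort, folded_sort) and the word list are made up for illustration, and the accent-stripping key is only a rough stand-in for real collation rules such as those CLDR describes; it is not how CLDR or any production system does it.

```python
import unicodedata

# Hypothetical sample data; "\u00e9" is a precomposed e-acute.
WORDS = ["zebra", "Apple", "\u00e9clair", "apple", "Zulu"]


def binary_sort(words):
    """Sort by raw UTF-8 bytes: no locale data, no tables, just byte comparison."""
    return sorted(words, key=lambda s: s.encode("utf-8"))


def folded_sort(words):
    """Crude approximation of a language-aware order (an assumption for
    illustration): strip combining accents and ignore case, then fall back
    to the original string to break ties."""
    def key(s):
        decomposed = unicodedata.normalize("NFKD", s)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        return (base.casefold(), s)
    return sorted(words, key=key)


if __name__ == "__main__":
    print(binary_sort(WORDS))  # uppercase first, accented letters last
    print(folded_sort(WORDS))  # closer to what a reader would expect
```

The binary version needs no data to maintain, which is the whole point; the price is that "Zulu" sorts before "apple" and accented words land at the end.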
It is easy to point at a collection of data; it is not that easy to merge it into a production environment, handle data conflicts, staging, etc. Do you want to have a few people working full-time just on this?
Shrug, Domas