Kaixo!
On Mon, Apr 11, 2005 at 09:02:24AM -0400, zhengzhu wrote:
> With Monk's help, I have implemented a preliminary system for conversion between Cyrillic and Latin for BE. There is a test site at
Very nice indeed; a lot of languages should benefit from such a feature.
However, languages that have Latin script in their list show that there are various exceptions to implement:
- HTML entities (starting with '&' and ending with ';') must not be converted.
- interwiki links should be handled as exceptions too (as the list of valid interwiki domains is known, e.g. the possible xx in [[xx:foo]], it should be easy to implement).
- it would also be nice to detect URLs and not convert them.
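
To make the exceptions above concrete, here is a rough Python sketch. It is not the code of the actual converter: the mapping table is a toy one, the interwiki prefix list is a made-up subset, and the names (TOY_MAP, INTERWIKI_PREFIXES, convert) are invented for illustration. The idea is simply to find the protected spans with one regular expression, copy them through untouched, and transliterate only the text in between.

```python
import re

# Toy Cyrillic-to-Latin table, for illustration only (not real rules for any language).
TOY_MAP = {"а": "a", "б": "b", "в": "v", "д": "d", "к": "k", "о": "o", "т": "t"}

# Hypothetical subset of interwiki prefixes; the real list is much longer.
INTERWIKI_PREFIXES = ("be", "en", "ru", "eu")

_prefixes = "|".join(INTERWIKI_PREFIXES)
PROTECTED = re.compile(
    r"&#?\w+;"                                   # HTML entities such as &nbsp; or &#160;
    + r"|\[\[(?:" + _prefixes + r"):[^\]]*\]\]"  # interwiki links such as [[ru:foo]]
    + r"|https?://\S+"                           # URLs
)


def transliterate(text):
    """Map each character through the toy table; unknown characters pass through."""
    return "".join(TOY_MAP.get(ch, ch) for ch in text)


def convert(text):
    """Transliterate text, leaving entities, interwiki links and URLs untouched."""
    out = []
    pos = 0
    for match in PROTECTED.finditer(text):
        out.append(transliterate(text[pos:match.start()]))  # ordinary text
        out.append(match.group(0))                          # protected span, copied as-is
        pos = match.end()
    out.append(transliterate(text[pos:]))
    return "".join(out)


print(convert("вода &nbsp; [[ru:вода]] http://example.org/вода да"))
```

A link whose prefix is not in the known list (or that has no prefix at all) is treated as ordinary text and converted, which is why having the list of valid interwiki domains matters.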
Then, there would need to be a way to force conversion even for things that would otherwise be exceptions (that is, the opposite of -{ }- ).

And the very nice thing would be a way to suggest the appropriate conversion for a given string. For example, foreign people's names could be written differently in Cyrillic and Latin; maybe something like
"blablabla ={Latn:Saratxaga|Cyrl:Сарачага}= blabla",
that would be displayed as "blablabla Saratxaga blabla" or "блаблаблабла Сарачага блаблабла", but not as "блаблабла Саратхага блаблабла".

Maybe the Latn:/Cyrl: could be removed, as the script can be found from the strings; the syntax would then be easier for the editors:
"blablabla ={Saratxaga|Сарачага}= blabla"
or, if they write in Cyrillic:
"блаблаблабла ={Сарачага|Saratxaga}= блаблабла"
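
As a rough illustration of the ={ }= idea, here is a small Python sketch. The markup itself is only the proposal from this mail, and the function names (detect_script, parse_alternatives, render) are invented for the example. It accepts both the tagged form (Latn:/Cyrl:) and the untagged form, where the script is guessed from the Unicode name of the first letter of each alternative.

```python
import re
import unicodedata

# The proposed ={ ... | ... }= markup (hypothetical syntax, as suggested above).
SUGGESTION = re.compile(r"=\{(.*?)\}=")


def detect_script(s):
    """Guess 'Cyrl' or 'Latn' from the first letter of s; None if undecidable."""
    for ch in s:
        if ch.isalpha():
            name = unicodedata.name(ch, "")
            if name.startswith("CYRILLIC"):
                return "Cyrl"
            if name.startswith("LATIN"):
                return "Latn"
    return None


def parse_alternatives(body):
    """Turn 'Latn:X|Cyrl:Y' or 'X|Y' into a dict such as {'Latn': X, 'Cyrl': Y}."""
    variants = {}
    for part in body.split("|"):
        part = part.strip()
        tag, sep, rest = part.partition(":")
        if sep and tag in ("Latn", "Cyrl"):
            variants[tag] = rest          # explicit tag given by the editor
        else:
            script = detect_script(part)  # no tag: guess from the characters
            if script:
                variants[script] = part
    return variants


def render(text, target):
    """Replace each ={...}= span with the alternative for the target script."""
    def pick(match):
        variants = parse_alternatives(match.group(1))
        # If no alternative exists for the target script, leave the markup as-is.
        return variants.get(target, match.group(0))
    return SUGGESTION.sub(pick, text)


print(render("blablabla ={Saratxaga|Сарачага}= blabla", "Cyrl"))
```

This only resolves the marked spans; the surrounding text would still go through the normal converter afterwards, with the chosen alternative protected from conversion.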
Maybe we can start to create the transliteration rules for some languages and put them somewhere (on meta?)