Delirium provides a good reference - yes, something we can agree on! :)
However, I'd take issue with the converter being considered "impossible by many experts." If a human being with expert knowledge can map the text across, then the task is certainly not in the domain of "impossible" tasks.
In fact, ZH Wikipedia could be a leader in the development of such a system, and create a modular open-source solution for others to use too. The "power of many" can certainly be seen in this task, which has a variety of special cases for mapping based on context. The mapping system itself could be run like a wiki, so people can contribute and alter the rules as needed, as colloquial grammar and usage change with time.
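To make the idea of an editable rule set a bit more concrete, here is a rough sketch in Python. Everything in it is an assumption for illustration only: the names PHRASE_RULES, CHAR_DEFAULTS and to_traditional are hypothetical, the tables hold just a handful of entries, and a real converter would need far larger tables and probably smarter context handling. The point is only that the rules are plain data, so contributors could add or change entries without touching the code.

```python
# Hypothetical rule tables (illustrative only, nowhere near complete).
# Phrase-level rules carry the context; they win over per-character defaults.
PHRASE_RULES = {
    "头发": "頭髮",  # "hair": here 发 must become 髮, not the default 發
    "皇后": "皇后",  # "empress": here 后 stays 后, not the default 後
}

# Per-character fallback used when no phrase rule matches.
CHAR_DEFAULTS = {
    "头": "頭",
    "发": "發",
    "后": "後",
}


def to_traditional(text, phrase_rules=PHRASE_RULES, char_defaults=CHAR_DEFAULTS):
    """Convert simplified text using longest-match phrase rules first."""
    max_len = max((len(key) for key in phrase_rules), default=1)
    out = []
    i = 0
    while i < len(text):
        # Try the longest multi-character phrase starting at position i.
        for size in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + size]
            if chunk in phrase_rules:
                out.append(phrase_rules[chunk])
                i += size
                break
        else:
            # No phrase rule matched: fall back to the single-character default,
            # or pass the character through unchanged if it is not listed.
            out.append(char_defaults.get(text[i], text[i]))
            i += 1
    return "".join(out)


# A contributor "editing the wiki" would simply add a rule like this one.
extended_rules = dict(PHRASE_RULES)
extended_rules["面条"] = "麵條"  # "noodles": 面 becomes 麵 only in this context

print(to_traditional("头发"))  # phrase rule applies
print(to_traditional("发展"))  # falls back to per-character defaults
print(to_traditional("面条", extended_rules))
```

Longest-match lookup is just one simple way to encode context; it handles fixed phrases but not cases that need real sentence-level analysis, which is where the hard problems are.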
The challenge now is getting enough critical mass and developers for ZH.
-Andrew Lih (User:Fuzheado)
On Tue, 22 Jun 2004 10:50:04 -0500, Delirium <delirium@hackish.org> wrote:
Roozbeh Pournader wrote:
I don't want to get into the debate, but just FYI, such a converter is considered impossible by many experts. By impossible, I mean impossible at the level of perfect German to English to German machine translation software. Refer to the Unicode mailing list and its archives for more details.
Just to provide some more concrete points of reference:
Jack Halpern and Jouni Kerman. "The Pitfalls and Complexities of Chinese to Chinese Conversion". Proceedings of the 14th International Unicode Conference, Cambridge, Massachusetts, USA, March 1999. [http://www.basistech.com/papers/chinese/c2c.html]
The biggest issue seems to be that certain simplified characters map to one of multiple traditional characters, depending on context. Thus a translator has to know the context, which requires solving some fairly daunting natural-language processing problems. Conversion from traditional to simplified seems to be much easier, as it typically collapses the total number of characters used.
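The asymmetry can be seen in a tiny Python sketch. The table TRAD_TO_SIMP and both function names are made up for this illustration, and the table lists only a few well-known characters: going from traditional to simplified is a plain per-character lookup, while inverting the same table leaves more than one candidate for some simplified characters.

```python
from collections import defaultdict

# Hypothetical, deliberately tiny table of traditional -> simplified characters.
TRAD_TO_SIMP = {
    "發": "发",  # emit, develop
    "髮": "发",  # hair
    "後": "后",  # after, behind
    "后": "后",  # empress
    "頭": "头",  # head
}


def to_simplified(text):
    """Many-to-one direction: a per-character lookup is enough."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


def traditional_candidates():
    """Invert the table to show which simplified characters are ambiguous."""
    inverse = defaultdict(set)
    for trad, simp in TRAD_TO_SIMP.items():
        inverse[simp].add(trad)
    return inverse


print(to_simplified("頭髮"))
print(to_simplified("發展"))
for simp, trads in traditional_candidates().items():
    print(simp, sorted(trads))
```

Any simplified character with more than one candidate is a spot where the converter has to look at the surrounding text to choose.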
-Mark
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l