But the difference between the two isn't merely a "difference of character sets". Converting at the level of the individual character will inevitably produce poor results; it is necessary to convert documents at the level of lexemes, which requires some sort of artificial intelligence capable of separating Chinese text into individual lexemes before conversion. It is also necessary to convert names of countries and special terminology, including Wikipedia terminology: the first two characters of the Simplified Chinese name for "Wikipedia", taken alone, would be translated into English as the name "Vicky", which would be converted into Traditional in a specific way, but that is not how "Wikipedia" is currently written in Traditional Chinese. In addition, Simplified Chinese is more tolerant of English words in the Roman alphabet than Traditional is (except perhaps in Hong Kong, where anglicisms are often even more frequent), as various article texts show.
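To make the point concrete, here is a minimal sketch of the gap between character-level and lexeme-level conversion. It is only an illustration under stated assumptions: the two tables are tiny, hypothetical samples I made up for this message, the target is Taiwan usage, and a greedy longest-match dictionary lookup stands in for the real lexeme segmentation, which would need something much smarter.

```python
# Hypothetical sample tables, for illustration only. A real converter
# would need many thousands of entries and genuine word segmentation.
CHAR_TABLE = {"头": "頭", "发": "發", "软": "軟"}

# Lexemes whose correct Traditional (Taiwan) form cannot be derived
# character by character.
PHRASE_TABLE = {
    "头发": "頭髮",  # "hair": 发 must become 髮 here, not 發
    "软件": "軟體",  # "software": Taiwan uses a different word
}


def convert_by_character(text):
    """Naive conversion: map each character independently."""
    return "".join(CHAR_TABLE.get(ch, ch) for ch in text)


def convert_by_lexeme(text):
    """Greedy longest-match: try known phrases first, then fall back
    to the single-character table."""
    max_len = max(len(phrase) for phrase in PHRASE_TABLE)
    out = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in PHRASE_TABLE:
                out.append(PHRASE_TABLE[chunk])
                i += size
                break
        else:
            # No phrase starts here; convert one character.
            out.append(CHAR_TABLE.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert_by_character("头发"))  # 頭發 (wrong character for "hair")
print(convert_by_lexeme("头发"))     # 頭髮
```

Greedy matching will still mis-segment ambiguous text, which is exactly why I say some real intelligence is needed; the sketch only shows that the unit of conversion has to be larger than one character.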
Some people here are saying that "if I read this text in simplified aloud, a Taiwanese person can understand it". That is not the issue at hand. If zh: were in Pinyin, perhaps that would be the issue, or if it were a spoken encyclopedia, maybe. But this is a written encyclopedia. zh-cn: and zh-tw: may be largely the same spoken language, but they are hardly the same written language.
--Jin Junshu/Mark
On Fri, 10 Sep 2004 16:58:39 -0400, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Sep 10, 2004, at 4:36 PM, yuanml wrote:
Yes, if you set up zh-tw, people will go there and write articles. But splitting the community will only weaken the growth of the small project and bring more difficulties in the future. Suppose there are two projects, zh-cn and zh-tw, now, and someday we want to synchronize them: you will find it is very difficult.
Yes, if you don't want to synchronize the two, there is no problem.
This is not the case - the two writing systems in the same documents do cause problems - merely different problems. The advantage of a technical fix to the character problem, so that there is one version, is that its problems don't grow with time, whereas those of diverged versions do. The problem set of making two character sets work isn't a fast-moving target, whereas keeping up with two sets of Wikipedians is.
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l