But the difference between the two isn't merely a "difference of
character sets". Converting at the level of the individual character
will inevitably produce poor results; it is necessary to convert
documents at the level of lexemes, which requires some sort of
artificial intelligence capable of separating Chinese text into
individual lexemes before conversion. It is also necessary to convert
names of countries and special terminology, including Wikipedia
terminology: the first two characters of the Simplified Chinese name
for "wikipedia", translated alone into English, would be the name
"Vicky", which would be converted into Traditional in a specific way,
but the current way to write "wikipedia" in Traditional Chinese is not
like that. Also, Simplified Chinese is more tolerant of English words
written in the Roman alphabet than Traditional is (except perhaps in
Hong Kong, where anglicisms are often even more frequent), as is
exemplified by various article texts.
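
To make the character-versus-lexeme point concrete, here is a minimal
sketch in Python. The two mapping tables are tiny and hypothetical,
made up for illustration only, and the greedy longest-match lookup is
a deliberately simple stand-in for the real lexeme segmentation
described above, not a proposal for how the conversion should actually
be implemented.

```python
# Hypothetical, tiny mapping tables for illustration only. A real
# converter would need dictionaries with many thousands of entries.
CHAR_MAP = {"头": "頭", "发": "發", "软": "軟"}
PHRASE_MAP = {
    "头发": "頭髮",  # "hair": here the second character must become 髮, not 發
    "发展": "發展",  # "develop": here the first character does become 發
    "软件": "軟體",  # "software": Taiwan usage differs in vocabulary, not just glyphs
}


def convert_by_char(text):
    """Convert one character at a time, ignoring context."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)


def convert_by_phrase(text):
    """Greedy longest-match conversion: try known phrases first,
    then fall back to the single-character table."""
    max_len = max(len(phrase) for phrase in PHRASE_MAP)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i first.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in PHRASE_MAP:
                out.append(PHRASE_MAP[chunk])
                i += size
                break
        else:
            # No phrase matched here: convert a single character.
            out.append(CHAR_MAP.get(text[i], text[i]))
            i += 1
    return "".join(out)


print(convert_by_char("头发"))    # character level picks the wrong variant
print(convert_by_phrase("头发"))  # phrase level picks the right one
```

With these tables the character-level function turns "头发" into
"頭發", while the phrase-level one gives "頭髮". Greedy matching will
still mis-segment ambiguous text, which is why something smarter is
needed in practice.
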
Some people here are saying that "if I read this text in simplified
aloud, a Taiwanese person can understand it". That is not the issue at
hand. If zh: were in Pinyin, perhaps, that would be the issue, or if
it were a spoken encyclopedia, maybe. But this is a written
encyclopedia. zh-cn: and zh-tw: may be largely the same spoken
language, but they are hardly the same written language.
--Jin Junshu/Mark
On Fri, 10 Sep 2004 16:58:39 -0400, Stirling Newberry
<stirling.newberry(a)xigenics.net> wrote:
> On Sep 10, 2004, at 4:36 PM, yuanml wrote:
>> Yes, if you set up zh-tw, people will go there and write articles.
>> But the split of the community will only weaken the growth of the
>> small project and bring more difficulty in the future. Suppose there
>> are two projects, zh-cn and zh-tw, now, and someday we want to
>> synchronize them: you will find it is very difficult.
>> Yes, if you don't want to synchronize the two, there is no problem.
>
> This is not the case - the two writing systems in the same documents do
> cause problems - merely different problems. The advantage of a
> technical fix to the character problem, so as to have one version, is
> that the problems don't grow with time, whereas those of diverged
> versions do. The problem set of making two character sets work isn't a
> fast-moving target, whereas keeping up with two sets of Wikipedians is.
_______________________________________________
Wikipedia-l mailing list
Wikipedia-l(a)Wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikipedia-l