Answer me this question: Are you a traditional or a simplified user, primarily? Your e-mail address ends in .cn so I am just assuming you are primarily a Simplified user, please correct me if I am wrong.
Yes, I’m from Beijing, a mainlander. But just as you said, how is my *nationality* any more relevant to /this particular issue/ than the number of pets I have or my favourite colours?
The reason there was only one active contributor is because zh-tw: was not being advertised at all. I assume if I had really looked for contributors there would have been at least one or two other people working on it w/me.
Yes, if you set up zh-tw, people will go there and write articles. But splitting the community will only weaken the growth of a small project and bring more difficulty in the future. Suppose there are two projects now, zh-cn and zh-tw, and someday we want to synchronize them: you will find it very difficult.
Yes, if you don’t want to synchronize the two, there is no problem. But why should we write the same things twice when we have the same language? It is true that some terminology differs in technical and pop-culture fields. But what about the parts that zh-tw and zh-cn share? We have the same universe, the sun, the planets, species, the same mathematics and logic, and the same thousands of years of history, because this knowledge took its modern shape mostly before the 1950s, when Jiang’s government retreated to Taiwan. Even in pop culture, I don’t think the difference is so big. Please note that A-mei has more fans on the mainland than in Taiwan.
On Sep 10, 2004, at 4:36 PM, yuanml wrote:
Yes, if you set up zh-tw, people will go there and write articles. But splitting the community will only weaken the growth of a small project and bring more difficulty in the future. Suppose there are two projects now, zh-cn and zh-tw, and someday we want to synchronize them: you will find it very difficult.
Yes, if you don’t want to synchronize the two, there is no problem.
This is not the case - the two writing systems in the same documents do cause problems - merely different problems. The advantage of a technical fix to the character problem, so that there is one version, is that the problems don't grow with time, whereas those of diverged versions do. The problem set of making two character sets work isn't a fast-moving target, whereas keeping up with two sets of Wikipedians is.
But the difference between the two isn't merely a "difference of character sets". Rather than converting on the level of the individual character, which will inevitably produce poor results, it is necessary to convert documents on the level of lexemes, for which one needs some sort of artificial intelligence capable of separating Chinese texts into individual lexemes before conversion. It is also necessary to convert names of countries, special terminology (including Wikipedia terminology: the first two characters in the Simplified Chinese name for "wikipedia" would be translated alone into English as the name "Vicky", which would be converted into Traditional in a specific way, but the current way to write "wikipedia" in Traditional Chinese is not like that), etc; also Simplified Chinese is more tolerant of the usage of English words in the Roman alphabet than is Traditional (except perhaps in Hong Kong where anglicisms are often even more frequent) as is exemplified by various article texts.
Some people here are saying that "if I read this text in simplified aloud, a Taiwanese person can understand it". That is not the issue at hand. If zh: were in Pinyin, perhaps, that would be the issue, or if it was a spoken encyclopedia, maybe. But this is a written encyclopedia. zh-cn: and zh-tw: may be largely the same spoken language, but they are hardly the same written language.
--Jin Junshu/Mark
On Fri, 10 Sep 2004 16:58:39 -0400, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Sep 10, 2004, at 4:36 PM, yuanml wrote:
Yes, if you set up zh-tw, people will go there and write articles. But splitting the community will only weaken the growth of a small project and bring more difficulty in the future. Suppose there are two projects now, zh-cn and zh-tw, and someday we want to synchronize them: you will find it very difficult.
Yes, if you don't want to synchronize the two, there is no problem.
This is not the case - the two writing systems in the same documents do cause problems - merely different problems. The advantage of a technical fix to the character problem, so that there is one version, is that the problems don't grow with time, whereas those of diverged versions do. The problem set of making two character sets work isn't a fast-moving target, whereas keeping up with two sets of Wikipedians is.
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
On Sep 10, 2004, at 5:15 PM, Mark Williamson wrote:
But the difference between the two isn't merely a "difference of character sets". Rather than converting on the level of the individual character, which will inevitably produce poor results, it is necessary to convert documents on the level of lexemes, for which one needs some sort of artificial intelligence capable of separating Chinese texts into individual lexemes before conversion.
Having done document conversion, I can say the number of cases is manageable here, and belongs in the realm of "things that can be searched for and tagged". A cumbersome, but not hard, problem.
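The character-versus-lexeme point can be illustrated with a small Python sketch. It is only an illustration under stated assumptions, not the conversion software discussed on this list: the two tables are tiny hypothetical samples, and the names `CHAR_MAP`, `PHRASE_MAP`, `convert_chars`, and `convert_phrases` are made up for the example. It contrasts a naive per-character mapping with a longest-match lookup against a phrase table that is consulted first.

```python
# Hypothetical sample tables; a real converter would need far larger ones.
# Simplified characters that map to more than one Traditional character
# get a single default here, which is exactly what goes wrong per-character.
CHAR_MAP = {"头": "頭", "发": "發", "后": "後"}

# Multi-character entries that override the per-character default.
PHRASE_MAP = {"头发": "頭髮", "皇后": "皇后"}


def convert_chars(text: str) -> str:
    """Naive conversion: map each character independently."""
    return "".join(CHAR_MAP.get(ch, ch) for ch in text)


def convert_phrases(text: str) -> str:
    """Try the longest phrase-table match first, then fall back per character."""
    max_len = max(len(key) for key in PHRASE_MAP)
    out = []
    i = 0
    while i < len(text):
        # Try phrase lengths from longest down to 2 characters.
        for n in range(min(max_len, len(text) - i), 1, -1):
            chunk = text[i:i + n]
            if chunk in PHRASE_MAP:
                out.append(PHRASE_MAP[chunk])
                i += n
                break
        else:
            # No phrase matched at this position: convert one character.
            out.append(CHAR_MAP.get(text[i], text[i]))
            i += 1
    return "".join(out)


if __name__ == "__main__":
    for word in ("头发", "皇后", "以后"):
        print(word, convert_chars(word), convert_phrases(word))
```

In this sketch the per-character pass turns 头发 ("hair") into 頭發, while the phrase-first pass yields 頭髮. A real system would need much larger tables and probably proper word segmentation, which is the part of the problem still under dispute in this thread.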
It is also necessary to convert names of countries, special terminology (including Wikipedia terminology:
True, but again an enumerable change, solvable in software, or by the same process that we have now for proofing documents: namely people go through and make intelligent changes. The process would be to flip the toggle, scan the document for problems and then edit the underlying wiki-material, inserting the metacodes needed. Much as people now scan documents to find broken, redirected or ambiguous links, spelling errors and so on.
the first two characters in the Simplified Chinese name for "wikipedia" would be translated alone into English as the name "Vicky", which would be converted into Traditional in a specific way, but the current way to write "wikipedia" in Traditional Chinese is not like that), etc; also Simplified Chinese is more tolerant of the usage of English words in the Roman alphabet than is Traditional (except perhaps in Hong Kong where anglicisms are often even more frequent) as is exemplified by various article texts.
That's a dialectal, not linguistic, issue.
Some people here are saying that "if I read this text in simplified aloud, a Taiwanese person can understand it". That is not the issue at hand. If zh: were in Pinyin, perhaps, that would be the issue, or if it was a spoken encyclopedia, maybe. But this is a written encyclopedia. zh-cn: and zh-tw: may be largely the same spoken language, but they are hardly the same written language.
--Jin Junshu/Mark
The general consensus of linguists is that you are overstating the differences - that traditional and simplified represent the same "written" language because the grammar is the same and most of the syntax is the same. The visual difference is rather like the difference between using the Latinate Greek characters, the ones most people associate with Greek, and the older characters used in the classical age. A person who can read one can't read the other, but translation between the two is mainly a mechanical process that occasionally needs intervention. While the traditional/simplified problem is a couple of orders of magnitude more complicated, it isn't more complex in lexical theory.
Which is not to minimize the differences - if the community consensus is just "squash this!" then that is a mistake as large as simply brute-force creating two versions. There are technical and methodological hurdles that should be addressed, otherwise someone will reach the same conclusion that Jin Junshu has - namely that a traditional Wiki is needed, because there is a user community not well served by the simplified version. Part of this is based on political forces that are in operation out there: there is no desire among the Chinese reading and writing community to break Chinese into separate written languages - that is, to continue increasing differences until mutual intelligibility is a difficult hurdle to pass. At the same time, there is a desire among traditional users to continue to use traditional characters, and there is a larger corpus of texts, many of them fundamental texts, which exist as originals in traditional characters, and which argue for wiki handling traditional characters in an appropriate way.
On Sat, 11 Sep 2004 04:36:19 +0800, yuanml yuanml@pku.org.cn wrote:
Answer me this question: Are you a traditional or a simplified user, primarily? Your e-mail address ends in .cn so I am just assuming you are primarily a Simplified user, please correct me if I am wrong.
Yes, I'm from Beijing, a mainlander. But just as you said, how is my *nationality* any more relevant to /this particular issue/ than the number of pets I have or my favourite colours?
Your *nationality* is only partially relevant; what's relevant here is that you are primarily a Simplified user. Just as we would not let fr: decide whether or not wa: should be a separate Wikipedia, it makes no sense to allow Simplified users to decide whether or not there should be a separate Wikipedia for Traditional.
The reason there was only one active contributor is because zh-tw: was not being advertised at all. I assume if I had really looked for contributors there would have been at least one or two other people working on it w/me.
Yes, if you set up zh-tw, people will go there and write articles. But splitting the community will only weaken the growth of a small project and bring more difficulty in the future. Suppose there are two projects now, zh-cn and zh-tw, and someday we want to synchronize them: you will find it very difficult.
So what? One could make the same arguments for not having separate Wikipedias for different languages.
Yes, if you don't want to synchronize the two, there is no problem. But why should we write the same things twice when we have the same language? It is true that some terminology differs in technical and pop-culture fields. But what about the parts that zh-tw and zh-cn share? We have the same universe, the sun, the planets, species, the same mathematics and logic, and the same thousands of years of history, because this knowledge took its modern shape mostly before the 1950s, when Jiang's government retreated to Taiwan. Even in pop culture, I don't think the difference is so big. Please note that A-mei has more fans on the mainland than in Taiwan.
...
English people and Japanese people also have the same universe, sun, planet, species, maths, logic, and universal history, yet we have separate Wikipedias for English and Japanese...
By your logic, there shouldn't even be a zh: and we should only have one Wikipedia (which would probably be en: although I would much prefer is: or lb: or something of that sort)
--Jin Junshu/Mark