[Wikipedia-l] Re: One Chinese Wikipedia

zhengzhu zhengzhu at gmail.com
Tue Sep 14 19:43:54 UTC 2004

On Tue, 14 Sep 2004 15:10:10 -0400, Delirium <delirium at hackish.org> wrote:

> Just a question out of curiosity about how you handle this: what's the
> base language, or is there one?  Is the primary version of a document in
> Simplified, and then there are annotations for how to correctly
> translate it to Traditional (i.e. [simplified character|proper
> traditional character]), or is it the other way around, or are both
> Simplified and Traditional equally base languages?

This is a problem in general, but I think a minor one for the case of
Simplified/Traditional Chinese. In short, there is no "base" language;
the wikitext can in fact be mixed, using both Simplified and
Traditional characters. Here is the long explanation:

Out of about 5000 to 6000 commonly used Chinese characters, about half
of them (~2600) have different Simplified/Traditional forms. However,
the differences are very regular; there are fairly clear rules for how
one form maps to the other. An analogy in English would be a rule like:
always change "sh" at the start of a Simplified word to "ch" to get
the Traditional form (e.g. ship -> chip, sheep -> cheep). There are
a few exceptions, but most of the time these rules work. As a result,
there would be little difficulty for a Chinese editor to recognize
both Simplified and Traditional characters, regardless of his or her
native language. Plus, the editor will already have read the
(automatically) translated article first, which should contain far
fewer unfamiliar characters. I imagine it would be no more difficult to locate the
place one wants to correct than say, to make changes to a table.
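
To make the "no base language" point concrete, here is a minimal Python
sketch of character-level conversion over mixed text. It is only an
illustration, not the actual conversion code: the table S2T and the
function names are made up for this example, and the table holds just a
handful of the ~2600 pairs.

```python
# Hypothetical, tiny Simplified -> Traditional table (illustration only;
# a real converter would need a few thousand entries).
S2T = {
    "汉": "漢",
    "语": "語",
    "国": "國",
    "学": "學",
    "书": "書",
    "马": "馬",
    "门": "門",
}

# Reverse table, valid here because every pair above is one-to-one.
T2S = {trad: simp for simp, trad in S2T.items()}


def to_traditional(text: str) -> str:
    """Map each Simplified character to its Traditional form.

    Characters not in the table (shared forms, or text that is
    already Traditional) pass through unchanged.
    """
    return text.translate(str.maketrans(S2T))


def to_simplified(text: str) -> str:
    """Map each Traditional character to its Simplified form."""
    return text.translate(str.maketrans(T2S))


# Mixed wikitext: some characters Simplified, some Traditional.
mixed = "汉語中国學"
print(to_traditional(mixed))  # every character shown in Traditional form
print(to_simplified(mixed))   # every character shown in Simplified form
```

Because either direction is produced from the same stored text, it does
not matter which form an editor typed. The exceptions mentioned above,
where one Simplified character has more than one Traditional
counterpart, would need context-aware handling that this sketch leaves
out.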

