[Wikipedia-l] Re: One Chinese Wikipedia

Rowan Collins rowan.collins at gmail.com
Tue Sep 14 19:43:29 UTC 2004

> Just a question out of curiosity about how you handle this: what's the
> base language, or is there one?  Is the primary version of a document in
> Simplified, and then there are annotations for how to correctly
> translate it to Traditional (i.e. [simplified character|proper
> traditional character]), or is it the other way around, or are both
> Simplified and Traditional equally base languages?

Glancing at the current test implementation, I gather that neither has
'precedence': to put a manual translation, you say {-zh-cn one version
zh-tw the other-}. Whether the mappings are one-to-many in one
direction, the other, or both, is not a problem: you simply define
what you want displayed, in that particular case, in both versions.
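
To make that concrete, here is a rough sketch of how such a manual block
might be resolved for one reader. It assumes the markup looks exactly as
I quoted it above; the regex and the name render_manual are my own
invention for illustration, not taken from the test implementation.

```python
import re

# Hypothetical pattern for the manual-override markup as quoted in this
# message; the real test implementation may tokenise it differently.
MANUAL = re.compile(r"\{-zh-cn\s+(.*?)\s+zh-tw\s+(.*?)-\}", re.DOTALL)


def render_manual(text, variant):
    """Replace each manual block with the version written for `variant`."""
    if variant not in ("zh-cn", "zh-tw"):
        raise ValueError("unknown variant: " + variant)
    group = 1 if variant == "zh-cn" else 2
    return MANUAL.sub(lambda m: m.group(group), text)


sample = "A {-zh-cn 软件 zh-tw 軟體-} B"
print(render_manual(sample, "zh-cn"))
print(render_manual(sample, "zh-tw"))
```

Since the author writes out both versions by hand, nothing here needs to
know which direction of the mapping is one-to-many.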

What I'm not clear on, having not looked in any depth, is how the
article is actually *stored*. The explanation seems to imply that the
characters are simply recognised as being either:
a) in the desired writing system; no action needed
b) in the non-desired writing system; automatic translation required
or c) marked up as a special case; version chosen to match preference
as per special syntax
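
If that reading is right, the three cases could be sketched roughly as
below. This is only my guess at the logic, not the actual code: the
tables S2T and T2S are toy three-character examples, and the function
name convert and the regex are made up for illustration.

```python
import re

# Toy tables for illustration only; real tables are far larger and are
# not one-to-one in both directions.
S2T = {"软": "軟", "体": "體", "汉": "漢"}
T2S = {t: s for s, t in S2T.items()}

# Hypothetical pattern for the manual markup as quoted earlier.
MANUAL = re.compile(r"\{-zh-cn\s+(.*?)\s+zh-tw\s+(.*?)-\}", re.DOTALL)


def convert(text, variant):
    """Render stored text for a reader who prefers 'zh-cn' or 'zh-tw'."""
    table = S2T if variant == "zh-tw" else T2S
    group = 2 if variant == "zh-tw" else 1

    def auto(chunk):
        # (a) characters already in the desired system are not in the
        #     table and pass through unchanged;
        # (b) characters in the other system are mapped one by one.
        return "".join(table.get(ch, ch) for ch in chunk)

    out, pos = [], 0
    for m in MANUAL.finditer(text):
        out.append(auto(text[pos:m.start()]))
        # (c) marked-up special case: take the hand-written version.
        out.append(m.group(group))
        pos = m.end()
    out.append(auto(text[pos:]))
    return "".join(out)


print(convert("汉字 {-zh-cn 软件 zh-tw 軟體-}", "zh-tw"))
```

Note that nothing in this sketch records which system the text was
originally written in; the stored page can freely mix both.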

I may be wrong, but if I'm right this is obviously no use to the more
general case of languages/dialects. For that, you'd probably need to
store which language the 'original' was in in the database, and then
convert based on that. Although then you'd have the problem of changes
that weren't easily translatable back to that base, wouldn't you? i.e.
base is LangA, a LangB user makes a change, but that change is
ambiguous in LangA; how is that change recorded? Similarly, if a naive
LangB user "corrects" the automated translation, they may end up
creating an error in the LangA document, because they overwrote the
original rather than adding special syntax. Ouch. It's more
complicated than I expected, unless that's just cos I'm hungry... ;)
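
A small sketch of the "ambiguous in LangA" problem, borrowing the
Chinese case as a stand-in for LangA/LangB: two Traditional characters
share one Simplified form, so a change saved against a Simplified base
cannot be turned back automatically. The table and function names are
hypothetical and the table is deliberately tiny.

```python
# Toy table: 發 and 髮 are distinct Traditional characters that share
# the single Simplified form 发.
T2S = {"發": "发", "髮": "发", "頭": "头"}

# Invert the table, keeping every candidate for each base character.
S2T_CANDIDATES = {}
for trad, simp in T2S.items():
    S2T_CANDIDATES.setdefault(simp, []).append(trad)


def to_base(text):
    """Store a LangB (Traditional) edit in the LangA (Simplified) base."""
    return "".join(T2S.get(ch, ch) for ch in text)


def ambiguous_chars(base_text):
    """List base characters that have more than one way back to LangB."""
    return [ch for ch in base_text if len(S2T_CANDIDATES.get(ch, [])) > 1]


stored = to_base("頭髮")
print(stored, ambiguous_chars(stored), S2T_CANDIDATES["发"])
```

Once the edit is stored in the base form, the information about which
character the editor actually typed is gone unless it is recorded
separately, for instance with the special syntax.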

Rowan Collins BSc
