The "dialect" question is a very difficult one to answer and the
creation of zh-min-nan: has already made ripples in the zh: community.
The difference between Minnan and other "dialects" though is that, as
far as I'm aware, none of the other Chinese dialects/Sinitic languages
has a large movement to switch to a different writing system.
First and foremost this problem could be looked at in terms of Cantonese.
Modern written Cantonese actually comes in two varieties: one is simply
text written for Mandarin speakers but read with Cantonese
readings; the other uses Cantonese grammar and vocabulary that
Cantonese has but Mandarin doesn't.
Until very recently the latter had the higher status in Hong Kong and
Macau; upon reunification, however, the former gained the higher status.
Most Cantonese speakers, even if they don't know Mandarin, can read
texts written by a Mandarin speaker with little difficulty, but many
of the sentences are not how they would say them in everyday speech.
Then there is also the issue of Classical Chinese, which is very
different from modern Mandarin. Until very recently any sort of
reference work, such as an encyclopedia, would've been written in Classical
Chinese, which was the literary language.
There may be some movement to start a Classical Chinese Wikipedia but
if there is it must be very small.
However Classical Chinese sentences often seem more natural in
Cantonese or Hakka or other Southern dialects than do the equivalents
in written Mandarin.
Also, if you were to convert zh-min-nan: into Chinese characters, it
would become apparent very quickly that it wasn't Mandarin, especially
because Mandarin uses words such as 的 (de), which many people call
"bastardized classical Chinese": 的 was originally created
exclusively for writing Mandarin, whereas the character properly used for
Taiwanese and Classical Chinese is 之 (as you can see, 之 is a basic
character, but 的 has two different parts).
--金俊書/Mark
On Tue, 14 Sep 2004 20:43:29 +0100, Rowan Collins
<rowan.collins(a)gmail.com> wrote:
Just a question out of curiosity about how you handle this: what's the
base language, or is there one? Is the primary version of a document in
Simplified, and then there are annotations for how to correctly
translate it to Traditional (i.e. [simplified character|proper
traditional character]), or is it the other way around, or are both
Simplified and Traditional equally base languages?
Glancing at the current test implementation, I gather that neither has
'precedence': to put a manual translation, you say {-zh-cn one version
zh-tw the other-}. Whether the mappings are one-to-many in one
direction, the other, or both, is not a problem: you simply define
what you want displayed, in that particular case, in both versions.
What I'm not clear on, having not looked in any depth, is how the
article is actually *stored*. The explanation seems to imply that the
characters are simply recognised as being either:
a) in the desired writing system; no action needed
b) in the non-desired writing system; automatic translation required
or c) marked up as a special case; version chosen to match preference
as per special syntax
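
To make the three cases above concrete, here is a minimal sketch in
Python. It is not the actual test implementation: the character tables
are tiny illustrative stand-ins, the function names are made up, and
the markup pattern simply follows the {-zh-cn ... zh-tw ...-} form as
quoted in this message.

```python
import re

# Illustrative stand-in tables (assumption: real tables are far larger
# and not strictly one-to-one).
TO_TRADITIONAL = {"国": "國", "汉": "漢", "语": "語"}
TO_SIMPLIFIED = {trad: simp for simp, trad in TO_TRADITIONAL.items()}

# Manual markup in the form quoted above: {-zh-cn <text> zh-tw <text>-}
MANUAL = re.compile(r"\{-zh-cn (.*?) zh-tw (.*?)-\}")


def convert(text, table):
    # Cases (a) and (b): a character already in the desired system has no
    # table entry and passes through; otherwise it is mapped automatically.
    return "".join(table.get(ch, ch) for ch in text)


def render(stored, variant):
    """Render stored text for variant 'zh-cn' or 'zh-tw'."""
    table = TO_SIMPLIFIED if variant == "zh-cn" else TO_TRADITIONAL
    out = []
    pos = 0
    for match in MANUAL.finditer(stored):
        out.append(convert(stored[pos:match.start()], table))
        # Case (c): the author's manual choice wins; it is not auto-converted.
        out.append(match.group(1) if variant == "zh-cn" else match.group(2))
        pos = match.end()
    out.append(convert(stored[pos:], table))
    return "".join(out)
```

Under these assumptions the stored text can mix both systems freely,
e.g. render("汉語 {-zh-cn 计算机 zh-tw 電腦-}", "zh-tw") gives "漢語 電腦",
and nothing needs to record which system the text was first written in.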
I may be wrong, but if I'm right this is obviously no use to the more
general case of languages/dialects. For that, you'd probably need to
record in the database which language the 'original' was in, and then
convert based on that. Although then you'd have the problem of changes
that weren't easily translatable back to that base, wouldn't you? i.e.
base is LangA, a LangB user makes a change, but that change is
ambiguous in LangA; how is that change recorded? Similarly, if a naive
LangB user "corrects" the automated translation, they may end up
creating an error in the LangA document, because they overwrote the
original rather than adding special syntax. Ouch. It's more
complicated than I expected, unless that's just cos I'm hungry... ;)
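
The ambiguity problem described above can be sketched even at the
level of single characters. The following is only an illustration,
using the writing-system case as a stand-in for LangA/LangB: the table
holds one well-known many-to-one pair (發 and 髮 both simplify to 发),
and the function names are hypothetical.

```python
# Illustrative many-to-one table: two Traditional characters share one
# Simplified form. A real table would contain many more entries.
TRAD_TO_SIMP = {"發": "发", "髮": "发"}


def candidates(simp_char):
    """Return every Traditional character that maps to simp_char."""
    return sorted(t for t, s in TRAD_TO_SIMP.items() if s == simp_char)


def round_trips(trad_text):
    """True if converting to Simplified and back is unambiguous
    for every character of trad_text."""
    return all(
        len(candidates(TRAD_TO_SIMP.get(ch, ch))) <= 1 for ch in trad_text
    )
```

Here round_trips("頭髮") is False: once 髮 has been stored as 发, an
automatic converter cannot tell which original to restore, which is
the kind of change that cannot be recorded cleanly against the base.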
--
Rowan Collins BSc
[IMSoP]
_______________________________________________
Wikipedia-l mailing list
Wikipedia-l(a)Wikimedia.org
http://mail.wikipedia.org/mailman/listinfo/wikipedia-l