I agree that this is a good idea, and in certain cases it could be done automatically for all occurrences of a word (that's how it works on zh.wikipedia right now, though you can add exclusions if you want).
However, conversion into Cantonese from BH ("Baihua", the current written standard used for Chinese based on the Mandarin speech of the Beijing area) has a few more difficulties than just that.
For example:
K'öi4 pei3 sa:m1-pun3 sü1 ngo4. -> Ta1 gei3 wo3 san1ben3 shu1. ("He gave me three books.")
The Cantonese is "he give three-COUNTER book me". The Mandarin is "he give me three-COUNTER book." A basic difference in word order.
Ngo4 höi5 ka:i1 ma:i4 ye4 sin1. -> Wo3 xian1 shang4 jie1 mai3 dong1xi. ("I'm going to the market to buy some things beforehand.")
The Cantonese is "I go market buy things before". The Mandarin is "I before go market buy things." (again, the different word order)
K'öi4 kou1-kwo5 ngo4. -> Ta1 bi3 wo3 gao1. ("He's taller than I am.")
The Cantonese is "he tall pass me". The Mandarin is "he compare me tall".
Kiu5 k'öi4 loi2. -> Ba3 ta1 jiao4 lai2. ("Ask him to come.")
The Cantonese is "call him come". The Mandarin is "take him call come".
Ngo4 höi5 Pak7king1. -> Wo3 shang4 Bei3jing1 qu4. ("I'm going to Beijing.")
The Cantonese is "I go Beijing". The Mandarin is "I up Beijing go."
M2 t'ai3-tak7-kin5 -> kan4-bu2-jian4 ("Can't see!")
The Cantonese is "not look can see". The Mandarin is "look not see".
Nei4 sik8 fa:n6 m2 sik8? -> Ni3 chi1 bu4 chi1 fan4? ("Do you eat rice?")
The Cantonese is "you eat rice not eat". The Mandarin is "you eat not eat rice".
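To make the word-order problem concrete, here is a minimal sketch of word-for-word substitution applied to the first example above. It is only an illustration, not how zh.wikipedia's conversion is actually implemented: the table, the function name `substitute_words`, and the `exclude` parameter are all made up for this message, and the only vocabulary used is the romanized words from the example itself.

```python
# Hypothetical word table, built only from the first example in this message.
CANTONESE_TO_MANDARIN = {
    "K'öi4": "Ta1",
    "pei3": "gei3",
    "sa:m1-pun3": "san1ben3",
    "sü1": "shu1",
    "ngo4": "wo3",
}


def substitute_words(sentence, table, exclude=frozenset()):
    """Replace each whitespace-separated word via the table, keeping word order.

    Words listed in `exclude` are left untouched (a stand-in for per-word
    exclusions); words missing from the table are also left as they are.
    """
    return " ".join(
        word if word in exclude else table.get(word, word)
        for word in sentence.split()
    )


cantonese = "K'öi4 pei3 sa:m1-pun3 sü1 ngo4"
expected_mandarin = "Ta1 gei3 wo3 san1ben3 shu1"

converted = substitute_words(cantonese, CANTONESE_TO_MANDARIN)
print(converted)                       # Ta1 gei3 san1ben3 shu1 wo3
print(converted == expected_mandarin)  # False
```

Every word is converted correctly, but "wo3" stays at the end of the sentence because substitution never moves anything. Getting the Mandarin order right would take a rule that knows about the grammar of the whole sentence, not a bigger word table.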
----
As you can see, these fundamental differences in the basic grammar of the two languages make it impossible with present technology to translate automatically and accurately between Cantonese and Mandarin.
Mark
On Sat, 25 Dec 2004 18:12:44 -0500, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Dec 25, 2004, at 5:57 PM, Mark Williamson wrote:
Stirling, I have a minor question here,
With this "conversion", do you mean conversion of terms between different dialects of Cantonese?
I believe that Wikimedia should make this capability available, and then let the communities decide how it is to be used within their own context. It would allow editors in Cantonese to decide to edit in the standard written Chinese Wikipedia but include Cantonese as an enrichment. It would also allow dialectal differences within a Cantonese Wikipedia, should one be established. It doesn't mandate either solution.
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l