Hello, I am a Cantonese speaker from Hong Kong, and I am interested in setting up a Wikipedia in Cantonese. Cantonese is spoken by around 70 to 80 million people in Hong Kong, Macau, the Chinese province of Guangdong, and many Chinese communities in Europe, North America and Southeast Asia. Attached below is the information on the language from Ethnologue.org.
David Chang
information from http://www.ethnologue.org/show_language.asp?code=YUH
CHINESE, YUE: a language of China SIL code: YUH ISO 639-1: zh ISO 639-2(B): chi ISO 639-2(T): zho
Population: 52,000,000 in mainland China, 4.5% of the population (1984). Includes 498,000 in Macau. Population total all countries 71,000,000 (1999 WA).
Region: Spoken in Guangdong (except for the Hakka speaking areas especially in the northeast, the Min Nan speaking areas of the east, at points along the coast as well as Hainan Island), Macau, and in the southern part of Guangxi. Also possibly in Laos. Also spoken in Australia, Brunei, Canada, Costa Rica, Honduras, Indonesia (Java and Bali), Malaysia (Peninsular), Mauritius, Nauru, Netherlands, New Zealand, Panama, Philippines, Singapore, South Africa, Thailand, United Kingdom, USA, Viet Nam.
Alternate names: YUET YUE, GWONG DUNG WAA, CANTONESE, YUE, YUEH, YUEYU, BAIHUA
Dialects: YUEHAI (GUANGFU, HONG KONG CANTONESE, MACAU CANTONESE, SHATOU, SHIQI, WANCHENG), SIYI (SEIYAP, TAISHAN, TOISAN, HOISAN, SCHLEIYIP), GAOLEI (GAOYANG), QINLIAN, GUINAN.
Classification: Sino-Tibetan, Chinese.
Comments: The Guangzhou variety is considered the standard. Subdialects of Yuehai are Xiangshan, spoken around Zhongshan and Shuhai, and Wanbao around Dongwan City and Bao'an County. Official language. Grammar. SVO; prepositions; genitives, relatives after noun heads; articles, adjectives, numerals before noun heads; word order mainly distinguishes subjects, objectives, indirect objects; passives usually indicated by adding a word in front of the verb; tonal. Outside of mainland China, many Cantonese-specific characters are used in the writing system. TV. Bible 1894-1981.
Also spoken in: Brunei Language name CHINESE, YUE Population 3,500 in Brunei, 6.93% of ethnic Chinese (1979). Alternate names YUE, YUEH, CANTONESE Comments Bible 1894-1981. See main entry under China.
Costa Rica Language name CHINESE, YUE Population 4,500 including Mandarin and Hakka speakers (1981 MARC). Alternate names YUE, YUEH, CANTONESE Comments Bible 1894-1981. See main entry under China.
Indonesia (Java and Bali) Language name CHINESE, YUE Population 180,000 in Indonesia (1982 CCCOWE). Alternate names CANTONESE, YUE, YUEH Comments Bible 1894-1981. See main entry under China.
Malaysia (Peninsular) Language name CHINESE, YUE Population 748,010 in Malaysia, including 704,286 in Peninsular Malaysia, 24,640 in Sarawak, 19,184 in Sabah (1980 census). Alternate names CANTONESE, YUE, YUEH Dialects CANTONESE, TOISHANESE. Comments Bible 1894-1981. See main entry under China.
Panama Language name CHINESE, YUE Alternate names YUE, YUEH, CANTONESE Comments Bilingualism in Spanish. Merchants. Bible 1894-1981. See main entry under China.
Philippines Language name CHINESE, YUE Population 6,000 to 7,200 or 1.2% of Chinese population (1982 CCCOWE). Comments Bible 1894-1981. See main entry under China.
Singapore Language name CHINESE, YUE Population 314,000 speakers in Singapore (1985), 12.3% of the population, out of 338,000 in the ethnic group (1993). Alternate names CANTONESE, YUE, YUEH, GUANGFU Comments Bible 1894-1981. See main entry under China.
Thailand Language name CHINESE, YUE Population 29,400 in Thailand, .5% of Chinese-speaking Chinese in Thailand (1984 estimate). Alternate names CANTONESE, YUE, YUEH Comments Bible 1894-1981. See main entry under China.
Viet Nam Language name CHINESE, YUE Population 900,000 in Viet Nam (1993 Dang Nghiem Van). Alternate names SUÒNG PHÓNG, QUANG DONG, HAI NAM, HA XA PHANG, MINH HUONG, CHINESE NUNG, NUNG, LOWLAND NUNG, HOA, HAN, TRIÈU CHAU, PHÚC KIÉN, LIEM CHAU, SAMG PHANG Comments Renowned fighters. Came from Canton, China as railroad workers and soldiers several decades ago. They are not the same as the Nung in the Tai family or the Tibeto-Burman Nung (Nu) of China and Myanmar. Chinese calligraphy. Daoist, Christian. Bible 1894-1981. See main entry under China.
Entries from the SIL Bibliography about this language: Huang Yuanwei. 1997. "The interaction between Zhuang and the Yue (Cantonese) dialects."
Shepherd. 2000. "Messages from a treasure box."
Ah, well, I guess that answers my question.
If you have a method you would write it in, then...
I see no problem here, but undoubtedly it will be cast in an extremely negative light by some of the users on zh: who will probably oppose it on the grounds that it will take oh-so-many valuable users away from the oh-so-poor zh.wikipedia.
Before somebody brings that argument up, let me address it.
zh.wikipedia already has over 10,000 articles. There are still ongoing issues with the complex situation of Chinese in relation to Wikipedia, although these were somewhat alleviated when automatic Simplified<->Traditional Chinese (sc<>tc) conversion was implemented (it also adjusts for region-specific terminology for the Mainland, Taiwan, Hong Kong, and Singapore).
If a Cantonese person wants to start a separate Cantonese Wikipedia, even after they see the advantages and disadvantages, I think that to deny them is like denying a Nynorsk Wikipedia.
However, the issue of Chinese varieties (I won't use the terms language or dialect here as they will bring up political issues and start unnecessary side-arguments) is still very complex. The Minnan Wikipedia barely made it in, and that was *after* they had proven themselves as a healthy separate site (Holopedia) with over 200 articles.
Some people might argue that all people who speak Cantonese can read baihua, but the same argument applies to other populations as well. For example, how many Basque monolinguals are there? Yet they have a Wikipedia because they have a right to one (well, they don't "have a right", but you know what I mean).
Mark
_______________________________________________ Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
Hello,
Mark Williamson wrote:
If you have a method you would write it in, then...
I see no problem here, but undoubtedly it will be cast in an extremely negative light by some of the users on zh: who will probably oppose it on the grounds that it will take oh-so-many valuable users away from the oh-so-poor zh.wikipedia.
*cough* I'm a Cantonese speaker from Hong Kong and I don't think we should. There's the issue that Mark already brought up, what method should we write it in: romanization has never been standardized and traditional Chinese but with Cantonese words is, well, redundant...
Some people might argue that all people who speak Cantonese can read baihua, but the same argument applies to other populations as well, for example how many Basque monolinguals are there? But they have a Wikipedia because they have a right (well, they don't "have a right",
Well, just because we can doesn't mean we should. ;)
little Alex
Hello List,
I completely agree with Alex Kwan. Having a Wikipedia in both Mandarin/simplified Chinese and Cantonese/traditional Chinese seems redundant to me as well. One could perhaps compare it with setting up extra Wikipedias for British and American English (I doubt that the differences are much bigger; except for the trad/simp issue).
That was my two yuán. - André
Hello,
André Müller wrote:
I completely agree with Alex Kwan. Having a Wikipedia in both Mandarin/simplified Chinese and Cantonese/traditional Chinese seems redundant to me as well. One could perhaps compare it with setting up extra Wikipedias for British and American English (I doubt that the differences are much bigger; except for the trad/simp issue).
Ack! You've completely mistaken my reasons why I'm opposed to it. I was afraid that we couldn't find a way to truly show the "Cantonese-ness" of the wikipedia and I'm very much an all-or-nothing person.
But the differences between Cantonese and Mandarin are far from the simplistic picture you've painted, and delving into them right now is beyond my time limit.
little Alex
Oh... I'm sorry, then I have misunderstood you. :( I also seem to have had a much different impression of Cantonese.
- now even smaller André
Hello,
Sorry to reply so late, but I was in mainland China.
André Müller wrote:
Oh... I'm sorry, then I have misunderstood you. :( I also seem to have had a much different impression of Cantonese.
- now even smaller André
Sorry, didn't mean to sound as brusque as I was. I had no intention of making you feel bad or anything. :(
little Alex
André Müller wrote:
I completely agree with Alex Kwan. Having a Wikipedia in both Mandarin/simplified Chinese and Cantonese/traditional Chinese seems redundant to me as well. One could perhaps compare it with setting up extra Wikipedias for British and American English (I doubt that the differences are much bigger; except for the trad/simp issue).
I tend to agree. Node is well known as an activist for forking Chinese into multiple projects, and so his comments should be considered in that full context.
--Jimbo
Excuse me? Yes, I want to split up the Chinese Wikipedia into multiple projects, no matter what lines it's upon! Mwahahahahaha!!!
Actually, no. With the upgrade to 1.4, most previous problems on zh.wikipedia have been fixed. I did submit a request to Wikicities for a separate project focused more on anthropology but still an encyclopedia in Traditional Chinese, but the two are unrelated.
Perhaps you should actually consider that I have /given you some evidence/?
This issue has been discussed in a little bit of depth here, by not just me but others as well. And you're basically ignoring what I'm saying and labelling it as activist because I like to fork zh: (again, wth!?), and you're also ignoring what some other people are saying.
If a new Wikipedia *is* created for Cantonese (remember, this *was* requested by a Cantonese speaker, not me), I think ultimately it should be in Chinese characters, but it should have the same Traditional-Simplified solution zh.wikipedia now uses, as there are speakers of Cantonese on the mainland who use Simplified characters too.
As far as I can tell, you have considered NONE of the evidence I have presented and simply skipped over it for whatever reason and then labelled it for others as "activism for forking Chinese into multiple projects".
Well, I have news for YOU, Jimbo. Check your inbox. Stirling Newberry agrees with me. And what he said is largely the same as what I said. So where does that put things?
Mark
Node, Jimbo simply said, "your comments should be considered in that full context."
I tend to agree with Jiaqing Bao that a Cantonese Wikipedia would be an interesting curiosity, but as with Min-nan, it seems destined to wither on the vine. The norm in Guangdong and Hong Kong is to use baihua for formal writing. It's better to have their efforts poured into the "traditional Chinese" content of ZH. It works both ways too - as HK and others learn to adapt to simplified and doing business in the mainland, folks in the PRC are rekindling interest in traditional Chinese characters too.
The Cantonese dialect does have unique colorful phrases and a different linguistic culture that manifests itself in Cantopop, film and cartoons. Some would seem foreign to "Mandarin" speakers. It would be great to have these Cantonese-isms captured in some way that could be done in a combined ZH Wikipedia.
-Andrew (User:Fuzheado)
The Cantonese dialect does have unique colorful phrases and a different linguistic culture that manifests itself in Cantopop, film and cartoons. Some would seem foreign to "Mandarin" speakers. It would be great to have these Cantonese-isms captured in some way that could be done in a combined ZH Wikipedia.
-Andrew (User:Fuzheado)
I'm again going to suggest that some sort of "dialectal bracketing" be implemented, which would allow phrases to be marked as being in a dialect of the language. Syntax would be something like [[dialect: base phrase | dialect: dialect phrase | dialect: dialect phrase]], and users would then pick the dialect of the language they are using. This would allow for everything from the differences between British and American English to far more distant written dialects.
Stirling Newberry wrote:
I'm again going to suggest that some sort of "dialectal bracketing" be implemented, which would allow phrases to be marked as being in a dialect of the language. Syntax would be something like [[dialect: base phrase | dialect: dialect phrase | dialect: dialect phrase]], and users would then pick the dialect of the language they are using. This would allow for everything from the differences between British and American English to far more distant written dialects.
IIRC there was stiff opposition for this last time because it'd confuse newcomers too much.
John Lee ([[en:User:Johnleemk]])
On Dec 25, 2004, at 10:31 AM, John Lee wrote:
IIRC there was stiff opposition for this last time because it'd confuse newcomers too much.
John Lee ([[en:User:Johnleemk]])
It's a trade-off between reader confusion, from dialectal inconsistencies, and editor confusion. Of these two, the latter is easier to solve, because educating editors is within the reach of Wikimedia, whereas changing readers is not. This syntax is no more difficult than the use of macros or image links, and considerably less complex than tables.
Stirling Newberry wrote:
It's a trade-off between reader confusion, from dialectal inconsistencies, and editor confusion. Of these two, the latter is easier to solve, because educating editors is within the reach of Wikimedia, whereas changing readers is not. This syntax is no more difficult than the use of macros or image links, and considerably less complex than tables.
The problem is that in English it will be greatly confusing and irritatingly convoluted. Even simple words will have to be written using the syntax (e.g. centre or center? Billion or milliard? Flavour or flavor?). Imagine editing an article with this sort of thing. I don't think the trade-off is worth it, at least for different English "dialects".
John Lee ([[en:User:Johnleemk]])
On Dec 25, 2004, at 11:01 AM, John Lee wrote:
The problem is that in English it will be greatly confusing and irritatingly convoluted. Even simple words will have to be written using the syntax (e.g. centre or center? Billion or milliard? Flavour or flavor?). Imagine editing an article with this sort of thing. I don't think the trade-off is worth it, at least for different English "dialects".
Less work than editing mathematical equations or tables by far. And less work than reorganizing and moving pages.
As for orthographic dialect changes, this is not an objection: there is already a traditional:simplified substitution that does a similar level of translation. The process would be to have a "reverse bot" which would find where users have written orthographic dialect variants, substitute the base dialect word for them, and then mark them for machine translation.
It's not conceptually difficult; whether it is what people want to do is another matter, but it is certainly "worth it" for readability and consistency.
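A rough illustration of what such a "reverse bot" might do, written as a minimal Python sketch. Everything concrete in it is an assumption made for illustration: the function name `reverse_bot`, the two-entry `VARIANTS` table, the label `en-us`, and the use of `dialect` as the label for the base form. It only shows the idea of finding variant spellings and wrapping them in the proposed bracket syntax.

```python
import re

# Hypothetical table: variant spelling -> (base spelling, dialect label).
# The entries and the "en-us" label are illustrative assumptions.
VARIANTS = {
    "center": ("centre", "en-us"),
    "flavor": ("flavour", "en-us"),
}


def reverse_bot(text, base_label="dialect"):
    """Wrap known variant spellings in the bracket markup proposed above.

    Only exact lowercase whole-word matches are handled, and text that is
    already bracketed is not skipped; a real bot would need both.
    """
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, VARIANTS)) + r")\b")

    def wrap(match):
        variant = match.group(1)
        base, label = VARIANTS[variant]
        return f"[[{base_label}: {base} | {label}: {variant}]]"

    return pattern.sub(wrap, text)


print(reverse_bot("The flavor of the city center"))
```

The editor keeps typing plain words; the markup is added afterwards, which is the part of the proposal meant to keep the burden off newcomers.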
Stirling, I have a minor question here,
With this "conversion", do you mean conversion of terms between different dialects of Cantonese? Or do you mean conversion between Cantonese and Mandarin? The former is certainly technically possible and definitely a good idea (although a written standard based on the speech of a large population center might be a simpler idea), but the latter is not going to work in the foreseeable future because, in addition to terminology, there are also differences of grammar between Cantonese and Mandarin, which in some cases include word order.
I think it's very important to make this distinction in this discussion because it is potentially very confusing to the uninitiated.
Mark
On Dec 25, 2004, at 5:57 PM, Mark Williamson wrote:
Stirling, I have a minor question here,
With this "conversion", do you mean conversion of terms between different dialects of Cantonese?
I believe that Wikimedia should make this capability available, and then let the communities decide how it is to be used within their own context. It would allow Cantonese editors to edit in the standard written Chinese Wikipedia but include Cantonese as an enrichment; it would also allow dialectal differences within a Cantonese Wikipedia, should one be established. It doesn't mandate either solution.
I agree that this is a good idea, and in certain cases it could be done automatically for all occurrences of a word (that's how it works on zh.wikipedia right now; you can do exclusions though if you want).
However, conversion into Cantonese from BH ("Baihua", the current written standard used for Chinese based on the Mandarin speech of the Beijing area) has a few more difficulties than just that.
For example:
K'öi4 pei3 sa:m1-pun3 sü1 ngo4. = Ta1 gei3 wo3 san1ben3 shu1. ("He gave me three books.")
The Cantonese is "he give three-COUNTER book me". The Mandarin is "he give me three-COUNTER book." A basic difference in word order.
Ngo4 höi5 ka:i1 ma:i4 ye4 sin1. -> Wo3 xian1 shang4 jie1 mai3 dong1xi. ("I'm going to the market to buy some things beforehand.")
The Cantonese is "I go market buy things before". The Mandarin is "I before go market buy things." (again, the different word order)
K'öi4 kou1-kwo5 ngo4. -> Ta1 bi3 wo3 gao1. ("He's taller than I am.")
The Cantonese is "he tall pass me". The Mandarin is "he compare me tall".
Kiu5 k'öi4 loi2. -> Ba3 ta1 jiao4 lai2. ("Ask him to come.")
The Cantonese is "call him come". The Mandarin is "take him call come".
Ngo4 höi5 Pak7king1. -> Wo3 shang4 Bei3jing1 qu4. ("I'm going to Beijing.")
The Cantonese is "I go Beijing". The Mandarin is "I up Beijing go."
M2 t'ai3-tak7-kin5 -> kan4-bu2-jian4 ("Can't see!")
The Cantonese is "not look can see". The Mandarin is "look not see".
Nei4 sik8 fa:n6 m2 sik8? -> Ni3 chi1 bu4 chi1 fan4? ("Do you eat rice?")
The Cantonese is "you eat rice not eat". The Mandarin is "you eat not eat rice".
----
As you can see, these fundamental differences in the very base of the language make it impossible with present technology to automatically translate accurately between Cantonese and Mandarin.
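To make the word-order point concrete, here is a small Python sketch using only the first example sentence from this message. The lexicon `MANDARIN_TO_CANTONESE` and the function `substitute` are hypothetical names introduced for illustration, and the romanizations are copied from the example as written. It shows that swapping each word for its counterpart, which is all a term-level converter does, gives the right words in the wrong order.

```python
# Word-for-word lexicon built only from the first example sentence above.
MANDARIN_TO_CANTONESE = {
    "Ta1": "K'öi4",
    "gei3": "pei3",
    "wo3": "ngo4",
    "san1ben3": "sa:m1-pun3",
    "shu1": "sü1",
}


def substitute(tokens, lexicon):
    """Replace each token independently; the word order is left untouched."""
    return [lexicon.get(token, token) for token in tokens]


mandarin = "Ta1 gei3 wo3 san1ben3 shu1".split()
cantonese = "K'öi4 pei3 sa:m1-pun3 sü1 ngo4".split()

converted = substitute(mandarin, MANDARIN_TO_CANTONESE)
print(" ".join(converted))                     # K'öi4 pei3 ngo4 sa:m1-pun3 sü1
print(converted == cantonese)                  # False: the order differs
print(sorted(converted) == sorted(cantonese))  # True: the words are the same
```

Fixing the result would need a rule that moves the indirect object after the direct object, and each of the other examples would need a different rule of its own.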
Mark
On Dec 25, 2004, at 6:47 PM, Mark Williamson wrote:
I agree that this is a good idea, and in certain cases it could be done automatically for all occurrences of a word (that's how it works on zh.wikipedia right now; you can do exclusions though if you want).
For example:
K'öi4 pei3 sa:m1-pun3 sü1 ngo4. = Ta1 gei3 wo3 san1ben3 shu1. ("He gave me three books.")
Which could be handled by bracketing: Ta1 gei3 [[dialect: wo3 | gd: ]] san1ben3 shu1 [[dialect: | gd: ngo4]].
Which would probably be macro'd since it is common. {{wo3}} and {{ngo4}} would do it as a pair.
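As a rough sketch of how the bracketing above might be rendered, the snippet below keeps one alternative per tag. It assumes that `dialect:` holds the default (Mandarin) wording and `gd:` the Cantonese one. The thread never spells that out, and `render` and `TAG` are hypothetical names, not existing MediaWiki functionality.

```python
import re

# Matches the tag form proposed in the message: [[dialect: X | gd: Y]].
# Either alternative may be empty. Assumption: "dialect" is the default
# (Mandarin) wording and "gd" the Cantonese wording.
TAG = re.compile(r"\[\[\s*dialect:\s*(.*?)\s*\|\s*gd:\s*(.*?)\s*\]\]")


def render(text: str, variety: str) -> str:
    """Keep one alternative from every tag; variety is 'dialect' or 'gd'."""
    if variety not in ("dialect", "gd"):
        raise ValueError("variety must be 'dialect' or 'gd'")
    group = 1 if variety == "dialect" else 2
    picked = TAG.sub(lambda match: match.group(group), text)
    # Empty alternatives leave doubled spaces behind; collapse them.
    return " ".join(picked.split())


marked = "Ta1 gei3 [[dialect: wo3 | gd: ]] san1ben3 shu1 [[dialect: | gd: ngo4]]"
print(render(marked, "dialect"))  # Ta1 gei3 wo3 san1ben3 shu1
print(render(marked, "gd"))       # Ta1 gei3 san1ben3 shu1 ngo4
```

In this sketch only the tagged pronoun changes form and position. Every untagged word is emitted unchanged for both varieties, so each further difference would need a tag of its own.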
The problem is that language isn't 100% predictable.
If it were, people would've used your solution before to translate even between languages that aren't related, for example English and Guarani.
The thing here is that when written in Chinese characters, wo3 and ngo4 use the same character. The issue isn't meant to be the different sounds, it's the different grammars.
A potential problem with manual tagging is that Mandarin monolinguals would not know where to put these in their articles, and to have a squad of people go around and tag all their text for them is ridiculous. Also, articles would look ridiculous when edited because there is the same issue for other Sinitic languages (although at the moment, it's my opinion they don't really need separate Wikipedias except perhaps Hakka and probably Dungan), and there would be spare tags hanging out all over the place.
When there is a simple difference of terminology and perhaps spelling (ie International vs American English), your idea is very practical, but when there are huge syntactical differences as there are between Mandarin and Cantonese, it's not practical at all unfortunately.
Mark
On Sat, 25 Dec 2004 18:56:29 -0500, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Dec 25, 2004, at 6:47 PM, Mark Williamson wrote:
I agree that this is a good idea, and in certain cases it could be done automatically for all occurrences of a word (that's how it works on zh.wikipedia right now; you can do exclusions though if you want).
For example:
K'öi4 pei3 sa:m1-pun3 sü1 ngo4. = Ta1 gei3 wo3 san1ben3 shu1. ("He gave me three books.")
Which could be handled by bracketing: Ta1 gei3 [[dialect: wo3 | gd: ]] san1ben3 shu1 [[dialect: | gd: ngo4]].
Which would probably be macro'd since it is common. {{wo3}} and {{ngo4}} would do it as a pair.
You are saying that we will be able to mark linguistic patterns. I also gave other objections in that message, did you read any of them?
Marking linguistic patterns has been tried before. It takes a lot of effort and doesn't produce satisfactory results. And I believe the word is "dialectal".
Why have one Wikipedia and write two sentences separately tagged in the same article? That makes no sense. It's more trouble for everybody.
Mark
On Sat, 25 Dec 2004 19:29:32 -0500, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Dec 25, 2004, at 7:16 PM, Mark Williamson wrote:
The problem is that language isn't 100% predictable.
That's not an objection to providing the ability for people to mark dialectical material.
On Dec 26, 2004, at 1:26 AM, Mark Williamson wrote:
You are saying that we will be able to mark linguistic patterns. I also gave other objections in that message, did you read any of them?
Marking linguistic patterns has been tried before. It takes a lot of effort and doesn't produce satisfactory results. And I believe the word is "dialectal".
Why have one Wikipedia and write two sentences separately tagged in the same article? That makes no sense. It's more trouble for everybody.
Lots of sensible things make no sense to you Mark, so I am not surprised.
More trouble? Less trouble than forking. And less trouble for readers who will be presented with information which is consistent in spelling and diction to their dialect.
And it is the readers who would find this the most useful, since it would make wikis with as large an editor base as possible.
As I said before, tagging of terms is one thing, but when you get to changing around grammar and word order, that is full-fledged machine translation and it won't work, no matter how many times you attack me personally.
Mark
On Sun, 26 Dec 2004 10:08:17 -0500, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Dec 26, 2004, at 1:26 AM, Mark Williamson wrote:
You are saying that we will be able to mark linguistic patterns. I also gave other objections in that message, did you read any of them?
Marking linguistic patterns has been tried before. It takes a lot of effort and doesn't produce satisfactory results. And I believe the word is "dialectal".
Why have one Wikipedia and write two sentences separately tagged in the same article? That makes no sense. It's more trouble for everybody.
Lots of sensible things make no sense to you Mark, so I am not surprised.
More trouble? Less trouble than forking. And less trouble for readers who will be presented with information which is consistent in spelling and diction to their dialect.
And it is the readers who would find this the most useful, since it would make wikis with as large an editor base as possible.
On Dec 26, 2004, at 2:47 PM, Mark Williamson wrote:
As I said before, tagging of terms is one thing, but when you get to changing around grammar and word order, that is full-fledged machine translation and it won't work, no matter how many times you attack me personally.
Mark
You made your credibility an issue, and therefore my questioning of your credibility is appropriate. As for "machine translation" being unworkable - you should really catch up with the developments in the field of machine translation.
Machine translation is unworkable to the degree that it is still not reasonable to use it to provide multilingual content and expect it to be reasonably correct.
If you believe otherwise, then where's your proposal to integrate all language Wikipedias using these recent advances in MT?
I may be a member of the UNDL foundation, but even such dream systems as UNL only claim to be accurate 99% of the time (and so far, UNL has its fair share of problems), and even when they are accurate the things they produce often sound unnatural or awkward.
Mark
On Sun, 26 Dec 2004 15:10:33 -0500, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Dec 26, 2004, at 2:47 PM, Mark Williamson wrote:
As I said before, tagging of terms is one thing, but when you get to changing around grammar and word order, that is full-fledged machine translation and it won't work, no matter how many times you attack me personally.
Mark
You made your credibility an issue, and therefore my questioning of your credibility is appropriate. As for "machine translation" being unworkable - you should really catch up with the developments in the field of machine translation.
On Dec 26, 2004, at 3:53 PM, Mark Williamson wrote:
Machine translation is unworkable to the degree that it is still not reasonable to use it to provide multilingual content and expect it to be reasonably correct.
Nothing in your rant is responsive to the proposal, and much of it is inaccurate, even given the strawman that it is attacking.
Again, your proposal is nothing short of machine translation because it includes using a machine to convert between different syntaxes, grammars, vocabularies, etc. Apparently you already realise this and think there's nothing wrong with that.
But the fact remains that machine translation is not as reliable as you say. If you have some sort of working model, then perhaps people will be willing to trust you, but I checked with some Wikipedians (both those I agree with most of the time and those I usually disagree with) and the general consensus is that in this proposal you're off your rocker.
Mark
On Sun, 26 Dec 2004 16:00:49 -0500, Stirling Newberry stirling.newberry@xigenics.net wrote:
On Dec 26, 2004, at 3:53 PM, Mark Williamson wrote:
Machine translation is unworkable to the degree that it is still not reasonable to use it to provide multilingual content and expect it to be reasonably correct.
Nothing in your rant is responsive to the proposal, and much of it is inaccurate, even given the strawman that it is attacking.
On Dec 26, 2004, at 4:26 PM, Mark Williamson wrote:
Again, your proposal is nothing short of machine translation
I would appreciate it if you stop lying about what is under discussion, making unsupported and inaccurate ex-cathedra statements and mischaracterizing what has been proposed.
"As I said before, tagging of terms is one thing, but when you get to changing around grammar and word order, that is full-fledged machine translation and it won't work, no matter how many times you attack me personally.
Mark
You made your credibility an issue, and therefore my questioning of your credibility is appropriate. As for "machine translation" being unworkable - you should really catch up with the developments in the field of machine translation."
Converting grammar and syntax between two different "systems" of grammar and syntax is machine translation. How does your proposal differ?
Mark
Stirling Newberry wrote:
On Dec 25, 2004, at 7:16 PM, Mark Williamson wrote:
The problem is that language isn't 100% predictable.
That's not an objection to providing the ability for people to mark dialectical material.
Hmm! Dialectic (adj.) = relating to dialectics; Dialectic (noun) = "the art of investigating or discussing the truth of opinions" etc. (per OED); Dialectical (adj.) = relating to logical discussion, etc.
The adjective that derives from "dialect" is "dialectal"
Ec
On Dec 26, 2004, at 2:55 PM, Ray Saintonge wrote:
Stirling Newberry wrote:
On Dec 25, 2004, at 7:16 PM, Mark Williamson wrote:
The problem is that language isn't 100% predictable.
That's not an objection to providing the ability for people to mark dialectical material.
Hmm! Dialectic (adj.) = relating to dialectics; Dialectic (noun) = "the art of investigating or discussing the truth of opinions" etc. (per OED); Dialectical (adj.) = relating to logical discussion, etc.
The adjective that derives from "dialect" is "dialectal"
Ec
Since people are down to mere proofing of the proposal, clearly there are no further objections to it.
Stirling Newberry wrote:
On Dec 26, 2004, at 2:55 PM, Ray Saintonge wrote:
Stirling Newberry wrote:
On Dec 25, 2004, at 7:16 PM, Mark Williamson wrote:
The problem is that language isn't 100% predictable.
That's not an objection to providing the ability for people to mark dialectical material.
Hmm! Dialectic (adj.) = relating to dialectics; Dialectic (noun) = "the art of investigating or discussing the truth of opinions" etc. (per OED); Dialectical (adj.) = relating to logical discussion, etc.
The adjective that derives from "dialect" is "dialectal"
Since people are down to mere proofing of the proposal, clearly there are no further objections to it.
This is a non-sequitur. I was just translating your comment so that English speakers would understand it. ;-)
Ec
Stirling Newberry wrote:
On Dec 26, 2004, at 6:04 PM, Ray Saintonge wrote:
This is a non-sequitur. I was just translating your comment so that English speakers would understand it. ;-)
Ec
Oh it was a spelling flame. Forgive me, I assumed good faith. My error.
Can we not snap at each other please? :-)
John Lee ([[en:User:Johnleemk]])
John Lee wrote:
Stirling Newberry wrote:
It's a trade-off between reader confusion, from dialectical inconsistencies, and editor confusion. Of these two, the latter is easier to solve, because educating editors is within the reach of Wikimedia, whereas changing readers is not. This syntax is no more difficult than the use of macros or image links, and considerably less complex than tables.
The problem is that in English it will be greatly confusing and irritatingly convoluted. Even simple words will have to be written using the syntax (e.g. centre or center? Billion or milliard? Flavour or flavor?). Imagine editing an article with this sort of thing. I don't think the trade-off is worth it, at least for different English "dialects".
Though I still do not support the idea in English, if the Chinese language wikipedians want it it should be their choice.
Ec
What I think is this: if there are people who want to write it, and people who want to read it, no matter the number on either side (as long as it's more than 2 or 3 people total), it's ridiculous to send down an iron fence in front of them and say "Andrew Lih and Jiaqing Bao do not believe you should be able to build a separate Wikipedia for this language. Your request is denied."
Similar things to what you said can be said for many minority languages, including for example Sicilian: Formal writing is usually done in Italian, but that doesn't mean Sicilian doesn't exist. Basque speakers often write in Spanish, and Luxembourgish speakers often write in German or French, but that doesn't mean their languages don't exist or that there will be no audience for these Wikipedias.
The problem of audience and readership is a problem for the individual Wikipedias and not the Wikimedia organization, as long as there is the "potential audience" of people who can understand the language when spoken or written (again, there are technologies so illiterates can take advantage of this technology).
You may see zh-min-nan as having "withered", but it's still growing. It would probably help if it had some instructions for the non-initiated on how to read peh-oe-ji, but nonetheless it is still growing, even if it's very slowly.
If some people say "we want a separate Wikipedia for our mother language" and another group says "we don't need a separate Wikipedia, let's use the old one" but the first group says again "we still want this separate Wikipedia, the current solution is not sufficient!", it is inappropriate for the first group to be quashed by the second simply because of a majority or minority.
We have some separate Wikipedias where we'd undoubtedly be able to get by with a unified Wikipedia, but due to issues of nationalism, and since it's never been done before, we haven't tried. But in the opposite direction we have had a couple of similar issues. The traditional vs simplified issue is thankfully now resolved (with the exception of relatively minor outstanding issues), we have a separate Wikipedia for Nynorsk, etc etc.
You talk about people adapting both ways. We aren't talking about traditional vs simplified anymore.
For some time now, Mandarin has totally dominated other Sinitic varieties and in some places there is the stereotype that a Mandarin speaker expects everybody to speak Mandarin, and if they don't, they must be daft (not that everybody fits this stereotype). Accommodations for Cantonese, Minnan, Hakka, Wu, Minbei, Gan, etc. are rarely made, and when they are it's usually in minor local issues, and nowadays it seems it mostly happens in Hong Kong, Macao, and Taiwan.
If I write a section of an article in colloquial Cantonese, and place it on zh:, what will you do? I think there is a good chance you will either remove it as "nonsense" or change it to baihua saying that you are "fixing it" - if this weren't the case, zh: would be a very muddled Wikipedia compared to what it is today.
As Stirling Newberry noted, there is a fairly recent phenomenon of material emerging in "CWY" (colloquial Cantonese). Similar signs have been seen from other Sinitic languages (Haishanghua was published in colloquial Wu an eternity ago, but other than that Wu hasn't exactly had a blossoming separate literature; Hakka is starting to emerge as a separate and acceptable variety in Taiwan), but right now I think that by far the most pronounced "linguistic rebellion" (i.e., assertion of linguistic independence) is for Cantonese.
Also, a Cantonese Wikipedia would presumably use hanzi rather than romanization, and thus would probably attract a larger crowd than zh-min-nan does (as Bao noted earlier, the fact that it's written in romanization is a turn-off for a lot of people).
The Cantonese dialect does have unique colorful phrases and a different linguistic culture that manifests itself in Cantopop, film and cartoons. Some would seem foreign to "Mandarin" speakers. It would be great to have these Cantonese-isms captured in some way that could be done in a combined ZH Wikipedia.
There is an emerging Cantonese literature, as Stirling Newberry noted.
Mark
Hello,
Mark Williamson wrote:
What I think is this: if there are people who want to write it, and people who want to read it, no matter the number on either side (as long as it's more than 2 or 3 people total), it's ridiculous to send
I think someone was talking about moving this conversation somewhere else? I was vacationing in mainland China and didn't catch where this was moved to? Would some kind soul please inform me? Thanks.
varieties and in some places there is the stereotype that a Mandarin speaker expects everybody to speak Mandarin, and if they don't, they must be daft (not that everybody fits this stereotype). Accommodations for Cantonese, Minnan, Hakka, Wu, Minbei, Gan, etc. are rarely made, and when they are it's usually in minor local issues, and nowadays it seems it mostly happens in Hong Kong, Macao, and Taiwan.
Actually, the attitude in Hong Kong has always been that you must be daft if you can't speak Cantonese with the perfect Hong Kong accent. It's only after 1997 that the various public transportation, etc. have Mandarin announcements. The "official" Hong Kong stance is two written languages and three spoken languages. Most people ended up writing bad English & somewhat sufficient Chinese and speaking perfect Cantonese, somewhat okay English, and lousy Mandarin.
Since I've been going to Guangzhou a lot lately, I've noticed that there is a big difference between the attitude about Cantonese in Guangzhou vs. Hong Kong. And in Macao, even the Macanese speak great Cantonese, though they all speak Portuguese at home.
As to the written system, Hong Kong has always used traditional Chinese and I hope that never changes. I can read simplified Chinese but I think it's the ugliest thing ever. And it's not like it helped the literacy rate or anything, since both Taiwan and Hong Kong have a higher literacy rate than mainland China. But I'm little old me and I'm not stupid enough to go against the whole Chinese government.
As Stirling Newberry noted, there is a fairly recent phenomenon of material emerging in "CWY" (colloquial Cantonese). Similar signs have
Depends entirely on what you think of as recent, as plenty of popular HK fiction in the 80s was written in Cantonese. But to the older generation, even if they are Cantonese speakers themselves, baihua is simply how you write. It's not just prejudice against Cantonese, but also prejudice against mass culture/pop culture.
little Alex
I think someone was talking about moving this conversation somewhere else? I was vacationing in mainland China and didn't catch where this was moved to? Would some kind soul please inform me? Thanks.
There was a discussion about the revival of intlwiki-l, but it lost steam, so currently the discussion is still going on here.
Actually, the attitude in Hong Kong has always been that you must be daft if you can't speak Cantonese with the perfect Hong Kong accent. It's only after 1997 that the various public transportation, etc. have Mandarin announcements. The "official" Hong Kong stance is two written languages and three spoken languages. Most people ended up writing bad English & somewhat sufficient Chinese and speaking perfect Cantonese, somewhat okay English, and lousy Mandarin.
Right - what I meant is that this happens in Hong Kong, but never on the mainland.
Since I've been going to Guangzhou a lot lately, I've noticed that there is a big difference between the attitude about Cantonese in Guangzhou vs. Hong Kong. And in Macao, even the Macanese speak great Cantonese, though they all speak Portuguese at home.
I'm curious about linguistic issues in Macao - do they share the Cantonese popular culture of Hong Kong?
As to the written system, Hong Kong has always used traditional Chinese and I hope that never changes. I can read simplified Chinese but I think it's the ugliest thing ever. And it's not like it helped the literacy rate or anything, since both Taiwan and Hong Kong have a higher literacy rate than mainland China. But I'm little old me and I'm not stupid enough to go against the whole Chinese government.
I agree here. Stylistically, even on the mainland all calligraphers and the like use Traditional, though Simplified does have some characters which were previously limited mostly to calligraphy (to me, they don't look very good in print). But unfortunately, when it comes to things like the UN, "marginal" Chinese regions using Traditional are trumped by the mainland and on an international level, people seem to use Simplified (I personally use Traditional for such purposes). If I recall correctly there was a poet who talked about the beauty of Chinese characters, comparing some to plants and animals. People have talked about this being lost if hanzi were ever to be discarded in favour of another system (pinyin, bopomofo), but I think it's already diminished by the "simplification" of Chinese characters. (really meaning reduction of strokes and in some cases merging of homophones)
Depends entirely on what you think of as recent, as plenty of popular HK fiction in the 80s was written in Cantonese. But to the older generation, even if they are Cantonese speakers themselves, baihua is simply how you write. It's not just prejudice against Cantonese, but also prejudice against mass culture/pop culture.
What I meant by "relatively recent" was in the last few decades instead of the last few centuries.
Best Mark
Jimmy (Jimbo) Wales wrote on 2004/12/24 at 05:17 PM:
André Müller wrote:
I completely agree with Alex Kwan. Having a Wikipedia in both Mandarin/simplified Chinese and Cantonese/traditional Chinese seems redundant to me as well. One could perhaps compare it with setting up extra Wikipedias for British and American English (I doubt that the differences are much bigger; except for the trad/simp issue).
"Redundancy", depending on how it's conceptualized, may well be a valid point. But certainly the linguistic distance between Cantonese and Mandarin is by no means insignificant, as evidenced by the fact that many quite basic, everyday words are non-cognates. On the other hand, British and American English dialects share virtually all basic words ("man", "woman", names of body parts, etc.). There are also, to a lesser extent, grammatical differences.
That is not to say that Cantonese has not developed a tradition of _formal_ writing strongly influenced by the early 20th-century Mandarin movement. This is akin to educated English speakers of the past moulding their grammar after Latin ("don't split the infinitive") and preferring Latinate words to native ones. The use of Chinese characters further constructs a sense of (formal) Cantonese as "merely Mandarin pronounced differently". At the same time, Cantonese has also developed a colloquial written tradition. While hardly as prestigious, it does serve sociolinguistic functions. Whether that includes writing encyclopedia articles is for the native speakers to decide.
I tend to agree.
Which part?
Node is well known as an activist for forking Chinese into multiple projects, and so his comments should be considered in that full context.
--Jimbo
Hello,
Mark Williamson wrote:
If you have a method you would write it in, then...
I see no problem here, but undoubtedly it will be cast in an extremely negative light by some of the users on zh: who will probably oppose it on the grounds that it will take oh-so-many valuable users away from the oh-so-poor zh.wikipedia.
*cough* I'm a Cantonese speaker from Hong Kong and I don't think we should. There's the issue that Mark already brought up, what method should we write it in: romanization has never been standardized and traditional Chinese but with Cantonese words is, well, redundant...
Heh, what I meant was that, if you write in Baihua, but just read it in Cantonese, it's redundant, but if you write it in *colloquial* Cantonese, but with characters, it wouldn't be redundant (but perhaps still a bit strange).
While for Minnan and Hakka the current trend seems to be to write them in "Peh-oe-ji", for Cantonese it seems to me like the trend is to use hanzi, but it includes some hanzi not used in writing baihua (this isn't a problem as Unicode supports almost all 'fangyanzi' from Cantonese, but not necessarily from others).
While there is an official romanisation system for Cantonese used in Hong Kong, I was under the impression that it was only for place names and had no provisions for marking tones. Jyutping seems to be gaining in popularity, but then there is also Cantonese Pinyin, and Yale romanisation, and then of course IPA (but who uses IPA?).
Another huge problem here is the fact that some people who try to write colloquial Cantonese on the mainland use Simplified characters, as opposed to Traditional in Hong Kong and Macau.
Some people might argue that all people who speak Cantonese can read baihua, but the same argument applies to other populations as well, for example how many Basque monolinguals are there? But they have a Wikipedia because they have a right (well, they don't "have a right",
Well, just because we can doesn't mean we should. ;)
Nonono, that's not what I meant. I meant, if Cantonese speakers want one, people like Shizhao should just let them instead of opposing it so much.
Mark
wikipedia-l@lists.wikimedia.org