To Mark Williamson:
By the way, since when am I trying to compare en/jp and tc/sc? I was merely responding to something somebody else said about SC and TC users "living in the same universe" or something.
I don't think I have lost my point.
tc/sc users share the same conceptual structure of the universe, but en/jp, en/tc and en/sc users do not. For example, in English the planet Venus is named after a goddess, but in both sc and tc the name of the planet Venus is built from the same things: gold and star. In a word, tc and sc are the same language. This is my point.
tc/sc users share not only the same grammar but also most of their knowledge systems. Let us set aside native Chinese knowledge, such as Chinese history and folklore, and talk about modern science. The terminology of modern science has been introduced to China since the Ming Dynasty, hundreds of years ago, and grew vastly after 1900. The Chinese knowledge system evolved into its modern form just after the New Culture Movement, around 1920. But the tc/sc split dates only from about 1956, so tc and sc share the same background for their knowledge systems.
From 1949 to the 1980s tc and sc evolved independently for lack of communication, so some newer terminology differs, for example in computer science. But since the 1980s, communication between tc and sc has increased by comparison.
Disclaimer: I can't read Chinese, so I don't know whether this is similar to any of the current or proposed solutions, but I have read some of the literature on the subject. My apologies if I'm going over old territory.
The best analogy is (I think) the difference between en-us and en-gb: the differences are mostly "spelling" and idioms. Automatic conversion is entirely possible, but occasionally imperfect. However, it should be possible to paraphrase around these problems where they occur and produce a single text that can be displayed (and edited) in either language and converted to-and-fro.
Perhaps one way to do it would be as in this fictitious example: suppose I have a simplified word that means "fish" but that could be transformed to either (say) "FISH" or "STONE" in the traditional script. We could auto-convert this '''into the Wiki source''' at edit time to markup like
[fish=FISH|STONE]
which would display as "fish" highlighted in some way when the page is rendered in simplified script to show there is a potential transliteration problem, and as [FISH|STONE] when rendered in traditional script.
Then it can be cleaned up in markup by writing:
[fish=FISH]
or similar markup, which will force the traditional rendering to the correct word, and remove the warning flag for simplified rendering, since there is now a one-to-one mapping. The same would apply in reverse for ambiguous conversions in the opposite direction. With any luck, this could be entirely lexicon-driven, and would need no AI research, because we would be able to find all pages containing ambiguities automatically, and then harness the copyediting skills of Wikipedians to find and disambiguate all the problematic text. We could even harness this when idioms or short phrases differ, to go:
[idiom in simplified=IDIOM IN TRADITIONAL]
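As a rough sketch only (Python chosen arbitrarily; the lexicon, the function names and the "(?)" warning marker are all made up for illustration and are not an existing MediaWiki feature), the edit-time pass and the two renderings might look something like this:

```python
import re

# Hypothetical lexicon: simplified word -> candidate traditional words.
# The entries mirror the fictitious fish/FISH/STONE example, not real data.
LEXICON = {
    "fish": ["FISH", "STONE"],   # ambiguous: needs a human decision
    "river": ["RIVER"],          # one-to-one: safe to convert silently
}

# Matches either an existing [simplified=TRADITIONAL] span or a bare word.
TOKEN = re.compile(r"\[[^\]]*\]|\w+")


def annotate(text):
    """Edit-time pass: wrap ambiguous words as [simp=TRAD1|TRAD2]."""
    def repl(match):
        tok = match.group(0)
        if tok.startswith("["):
            return tok  # already marked up, leave the editor's choice alone
        candidates = LEXICON.get(tok, [])
        if len(candidates) > 1:
            return "[" + tok + "=" + "|".join(candidates) + "]"
        return tok
    return TOKEN.sub(repl, text)


def render(text, script):
    """Render stored source as 'simplified' or 'traditional' text."""
    def repl(match):
        tok = match.group(0)
        if tok.startswith("["):
            simp, sep, trad = tok[1:-1].partition("=")
            if not sep:
                return tok  # not conversion markup, pass through unchanged
            options = trad.split("|")
            if script == "simplified":
                # Flag unresolved spans so editors can spot them.
                return simp + "(?)" if len(options) > 1 else simp
            return options[0] if len(options) == 1 else "[" + trad + "]"
        if script == "traditional":
            candidates = LEXICON.get(tok, [])
            return candidates[0] if len(candidates) == 1 else tok
        return tok
    return TOKEN.sub(repl, text)


def has_ambiguity(text):
    """True if any markup span still lists more than one candidate."""
    return any("=" in span and "|" in span.split("=", 1)[1]
               for span in re.findall(r"\[[^\]]*\]", text))
```

Running `annotate` over every page and listing those where `has_ambiguity` is true would give the worklist for copyeditors; once someone rewrites `[fish=FISH|STONE]` as `[fish=FISH]`, the page drops off the list.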
-- Neil
Actually it is very different.
The difference between Simplified and Traditional is not just one of idioms and terminology; it is a difference between writing systems. Even though the majority of characters have 1-to-1 correspondences, accurate conversion necessitates something that can tell from context which traditional character is the proper equivalent for the simplified character used.
On Sat, 11 Sep 2004 12:13:05 +0100, Neil Harris usenet@tonal.clara.co.uk wrote:
[...]
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
I would be curious to know who started the simple english wikipedia. If it was discussed/voted on in the main english wikipedia...
and mostly why it was set up... what audience it is addressed to...
Anthere
Anthere wrote:
I would be curious to know who started the simple english wikipedia. If it was discussed/voted on in the main english wikipedia...
and mostly why it was set up... what audience it is addressed to...
It is quite old; I rather doubt if there was a lot of discussion/debate/vote in the way we do things now. Someone asked for it, it was created, I have no further recollection.
I do know that the audience it is addressed to is "adult learners of English" or something of that nature. That is to say, it is not necessarily intended that the concepts be simple, only that the language be kept easy to read.
--Jimbo
wikipedia-l@lists.wikimedia.org