Re: [Wiktionary-l] Chinese, traditional and simplified

14 Sep 2004


      Andrew Dunbar wrote:
...
--- Gerard Meijssen gerardm@myrealbox.com wrote:
...
There is a big thing on the wikipedia-l about
writing up Chinese. One thing I gleaned from this
discussion is that zh-tw and zh-cn are used to 
indicate respectively traditional and simplified
Chinese. As it is relevant to wiktionary to have
both correct spellings, I propose to use these codes
as well as the zh code to indicate Chinese words.
I don't have references but I'm sure I've read this is
not a good idea: all countries actually do use both
scripts sometimes, and sometimes the countries use
entirely different words for the same thing, so it
doesn't always even come down to a choice of
character.
I think we'd be overloading a country code as a script
code when what we really need is a script code. That
would not be ambiguous. Something like zh-trad and
zh-simp. Or zh-trd and zh-smp.
I think I've seen a page on Microsoft's site which
does it this way somewhere but I doubt they really
use it. They do have language numbers to take into
account the different scripts though, including the
Serbians below.
In a wikipedia context, it is not a good idea as you want to bring 
people together. In a wiktionary context things are imho different as we 
do provide all (correct) words in all languages. In English, many words 
are spelled differently depending on whether they are en-us, en-uk, en-aus, 
etc. The meaning of a word may be subtly different as well, so it is not 
always synonyms that we are talking about. With different scripts in one 
language you have words in a different script that are synonymous.
The pronunciation is also often different depending on where you come 
from: potato, router, etc.
In a wiktionary you want to define the words and make it plain where the 
word comes from. Speakers of all English variants can understand each 
other. It is up to wiktionary to allow for these differences. Consequently, 
I realise, it is not only about script. As Wikipedia already has zh-tw and 
zh-cn as codes, using them within wiktionary as well is reasonable. For 
Serbian, sr-cyr and sr-lat make sense to me.
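To make the region-versus-script distinction in this thread concrete, here is a minimal Python sketch. It assumes a simple "language-qualifier" code shape and uses two small lookup tables filled only with the qualifiers mentioned in this discussion; the tables and the function name describe_code are hypothetical illustrations, not any official registry or Wiktionary feature.

```python
# Hypothetical lookup tables for illustration only: they contain just the
# qualifiers mentioned in this thread, not an official registry.
REGION_QUALIFIERS = {"tw", "cn", "us", "uk", "aus"}
SCRIPT_QUALIFIERS = {"trad", "simp", "cyr", "lat"}


def describe_code(code):
    """Split a code such as 'zh-tw' into (language, qualifier, kind).

    kind is 'region', 'script', 'unknown', or None when the code has
    no qualifier at all (plain 'zh').
    """
    language, _, qualifier = code.lower().partition("-")
    if not qualifier:
        return language, None, None
    if qualifier in REGION_QUALIFIERS:
        kind = "region"
    elif qualifier in SCRIPT_QUALIFIERS:
        kind = "script"
    else:
        kind = "unknown"
    return language, qualifier, kind


if __name__ == "__main__":
    for code in ("zh", "zh-tw", "zh-trad", "sr-cyr", "en-aus"):
        print(code, describe_code(code))
```

With tables like these, zh-tw is reported as a region qualifier while zh-trad and sr-cyr are reported as script qualifiers, which is the ambiguity being debated.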
What I am not sure about is how we indicate the word as such. 
{{-xx-}} indicates a word in a language, and I really want to keep it that 
way. Is following it up with {{xx-xx}} to indicate the relevant subset 
(character set or regionality) reasonable?
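As a rough illustration of the markup being proposed, the Python sketch below builds the {{-xx-}} marker and, when a subset is given, the {{xx-xx}} marker that would follow it. The function name entry_markers is hypothetical, and the sketch only produces the strings; it assumes nothing about how such templates would be defined on the wiki.

```python
def entry_markers(language, subset=None):
    """Return the template markers for a word entry, in order.

    The first marker names the language ({{-xx-}}); the optional second
    marker names the subset within it ({{xx-xx}}), as proposed above.
    """
    markers = ["{{-" + language + "-}}"]
    if subset:
        markers.append("{{" + language + "-" + subset + "}}")
    return markers


if __name__ == "__main__":
    print(entry_markers("zh"))
    print(entry_markers("zh", "tw"))
    print(entry_markers("sr", "cyr"))
```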
Thanks,
    GerardM
...
...
I hope someone has a good suggestion for Serbian,
cyrillic and alphabetic.
Umm, Cyrillic is still alphabetic. I think you meant
Cyrillic and Latin. How about sr-cyr and sr-lat?
...
There are more languages that are written in
different character sets. I am looking forward to
suggestions.
The least obscure I can think of is Punjabi, which is
written in Gurmukhi, its own Indic script; Shahmukhi,
a derivation of the Urdu script, which is itself a
derivation of the Arabic script; and finally in
Devanagari, the most common script in India. But this is
already quite obscure. Many former Soviet republics
have 2 or 3 scripts as well.
Andrew Dunbar (hippietrail)
...
Thanks,
   GerardM
_______________________________________________
Wiktionary-l mailing list
Wiktionary-l@Wikipedia.org
