Hoi, For some of the newly created Wikipedias there are reasons why I would not create them as languages in WiktionaryZ in this manner. They are:
http://roa-tara.wikipedia.org/ Tarantino
http://cbk-zam.wikipedia.org/ Zamboanga Chavacano
http://zh-classical.wikipedia.org/ Classical Chinese
http://cu.wikipedia.org/ Old Church Slavonic
http://ru-sib.wikipedia.org/ Siberian/North Russian
- Tarantino is supposed to use the roa code. This code signifies nothing more than that it is a Romance language. The standard does specify what is included under the code, and Tarantino is not one of those languages.
- In ISO-639-3 cbk is the code for Chavacano. Zamboangueño is an alternative name.
- For Classical Chinese there is no specificity as to what is meant by this. This is also easy to explain, as the zh (zho) code itself is deprecated in ISO-639-3 because some 10 languages are included in this code.
- I might include Old Church Slavonic as chu. It is used as the liturgical language of various Orthodox and Byzantine Catholic churches. It is considered extinct.
- The ru-sib is a created language. It is a bad idea to create a code as if it is a dialect of the Russian language when it is not.
Thanks, GerardM
- Tarantino is supposed to use the roa code. This code signifies
nothing more than that it is a Romance language. The standard does specify what is included under the code, and Tarantino is not one of those languages.
What code, then, would you propose?
- In ISO-639-3 cbk is the code for Chavacano. Zamboangueño is an
alternative name.
No, there are several different varieties of Chavacano. Zamboangueño is the most widely-spoken among them, in the city of Zamboanga. This is mentioned in the proposal.
- For Classical Chinese there is no specificity as to what is meant by
this. This is also easy to explain, as the zh (zho) code itself is deprecated in ISO-639-3 because some 10 languages are included in this code.
Again, what code would you recommend instead, and why didn't you mention it at the proposal stage?
- I might include Old Church Slavonic as chu. It is used as
the liturgical language of various Orthodox and Byzantine Catholic churches. It is considered extinct.
Not sure what's wrong with cu. Yes, it is an "old" code, but it is shorter and easier to type.
- The ru-sib is a created language. It is a bad idea to create a code
as if it is a dialect of the Russian language when it is not.
Not according to the requester, it's not. The title of the Wikipedia is North Russian WP.
Mark
Gerard, please note that Wikipedia hasn't standardized on ISO 639-2 like Wiktionary has. Otherwise, you'd be using "eng.wikipedia.org" instead of "en.wikipedia.org". Maybe someday; I still remember going to "en2.wikipedia.org" way back when, so the URLs certainly aren't permanent.
Minh Nguyen wrote:
Gerard, please note that Wikipedia hasn't standardized on ISO 639-2 like Wiktionary has. Otherwise, you'd be using "eng.wikipedia.org" instead of "en.wikipedia.org".
With rare exceptions, we follow RFC 3066 for language codes; this is the standard used on the web for the Content-Language HTTP header, HTML 'lang' attribute, etc.
This uses, roughly, in priority order:
- 2-letter ISO 639-1 code
- 3-letter ISO 639-2 code
- One of the specially registered 'i-*' codes
- Non-standard 'x-*' codes
ISO 639-3 codes will most likely end up in there sooner or later.
Where no standard codes are available, we've sometimes ended up using something else; in this case we *should* be using the ISO 639-3 3-letter code. If there are exceptions that are definitely wrong, we probably want to fix that.
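The priority order above can be sketched roughly as follows. This is only an illustration, not the code that actually assigns wiki subdomains: the function name `pick_language_code` and its parameters are hypothetical, and the caller is assumed to already know which codes exist for a language.

```python
from typing import Optional


def pick_language_code(
    iso639_1: Optional[str] = None,
    iso639_2: Optional[str] = None,
    registered_tag: Optional[str] = None,
    private_name: Optional[str] = None,
) -> Optional[str]:
    """Return the first available code, following the rough priority
    order listed above (hypothetical helper, for illustration only)."""
    if iso639_1:            # 1. 2-letter ISO 639-1 code
        return iso639_1
    if iso639_2:            # 2. 3-letter ISO 639-2 code
        return iso639_2
    if registered_tag:      # 3. specially registered 'i-*' code
        return registered_tag
    if private_name:        # 4. non-standard 'x-*' code
        return "x-" + private_name
    return None             # nothing usable; a code has to be chosen by hand


# English has both a 2-letter and a 3-letter code; the 2-letter one wins.
print(pick_language_code(iso639_1="en", iso639_2="eng"))
# Cebuano has no 2-letter code, so the 3-letter code is used.
print(pick_language_code(iso639_2="ceb"))
```

The last branch returning `None` corresponds to the "we've sometimes ended up using something else" case, where no standard code is available.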
Maybe someday; I still remember going to "en2.wikipedia.org" way back when, so the URLs certainly aren't permanent.
That was crude load-balancing before we had something better set up. :) It's redirected if you've got really old URLs from that period.
-- brion vibber (brion @ pobox.com)
Brion Vibber wrote:
Minh Nguyen wrote:
Maybe someday; I still remember going to "en2.wikipedia.org" way back when, so the URLs certainly aren't permanent.
That was crude load-balancing before we had something better set up. :) It's redirected if you've got really old URLs from that period.
I think Minh Nguyen's point was that subdomains can and do change. If we make up a non-standard code for a language, and a standard code either already exists or is added later, then we can move the wiki fairly easily.
-- Tim Starling
Hoi, WiktionaryZ does not follow RFC 3066 because it is unusable for its purposes. The RFC 3066 is unusable for the languages that are not addressed by it. To link into the projects of the Wikimedia Foundation we can include the code used by the Wikimedia Foundation.
The practice of using codes that are made up and that are not distinguishable from ISO-639 codes is against the terms of use. The range qaa to qtz is reserved for local use; those are the codes where you are free to do what you want to do.
Thanks, GerardM
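As a minimal sketch of the point about the qaa to qtz range, a check like the following would tell a locally defined code apart from one that could be mistaken for an assigned ISO-639 code. The function name `is_local_use_code` is hypothetical and not part of WiktionaryZ or MediaWiki.

```python
def is_local_use_code(code: str) -> bool:
    """True if `code` lies in the qaa..qtz range reserved for local use."""
    return (
        len(code) == 3
        and code.isascii()
        and code.isalpha()
        and code.islower()
        # For three lowercase ASCII letters, plain string comparison
        # matches the alphabetical range qaa..qtz.
        and "qaa" <= code <= "qtz"
    )


print(is_local_use_code("qaa"))  # inside the reserved range
print(is_local_use_code("chu"))  # an assigned code, not local use
```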
Gerard Meijssen <gerard.meijssen@...> writes:
Hoi, WiktionaryZ does not follow RFC 3066 because it is unusable for its purposes. The RFC 3066 is unusable for the languages that are not addressed by it. To link into the projects of the Wikimedia Foundation we can include the code used by the Wikimedia Foundation.
The practice of using codes that are made up and that are not distinguishable from ISO-639 codes is against the terms of use. The codes qaa to qtz are the codes where you are free to do what you want to do. Thanks, GerardM
Hi GerardM,
Since WiktionaryZ does not follow RFC 3066, I have no idea whether the language codes it uses conform to RFC 4646 or not. :S
Shinjiman
Hoi, RFC 4646 provides many things that I will gladly follow, particularly the way it allows for extensions. I have discussed the issues that I have with Felix Sasaki (W3C) and Gerhard Budin (ISO), among others.
- Many recognised languages have to "fit" as a subtag of another language. For many people who think "politically" about languages this is not acceptable. The way it is implemented is also a travesty for people who know about languages. The motivation is to provide backwards compatibility, even though ISO-639-2 was dismissed as useless and ISO-639-3 had to be created really quickly.
- Some of the things not yet standardised in the ISO codes are being worked on as new standards (e.g. dialects). The proposed codes are not known at this time, and this creates a mess of its own.
- Orthographies are not supported in the planned extensions to ISO-639. This is acknowledged as an omission.
Consequently, it is much better to make a clean break while still conforming to standards. The standards compliance existing in WiktionaryZ is better than the standards compliance of the current crop of Wikimedia language codes. These do not conform to a standard because they break the standards in many places. Some of the language codes used were voted in with a total disregard for what would be valid under the terms of use of the ISO-639 codes.
Let me repeat that WiktionaryZ does include a code for Wikimedia language codes; it makes sense to have backwards compatibility. The basis for inclusion of languages at WiktionaryZ at this time is the ISO-639-3 codes. When need be, the ISO-639-3 codes are extended using RFC 4646 as a guideline on how to do this, and we will and do ask people in standards organisations how to solve issues that fall outside what RFC 4646 provides answers for.
We do have a code currently indicated as "ISO-639-2" in our database. We could enter something there that would be compatible with RFC 4646 once it is figured out what would be a valid code under this regime. The problem is that many of the ISO-639-2 codes are deprecated in ISO-639-3. It would be a nice academic exercise to create these codes, but I do not want to deal with the political fallout that this creates.
Thanks, GerardM
Wikitech-l mailing list Wikitech-l@wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikitech-l
On Fri, Oct 06, 2006 at 11:10:24AM +0200, Gerard Meijssen wrote:
The practice of using codes that are made up and that are not distinguishable from ISO-639 codes is against the terms of use.
"Terms of Use"? Of *what?
Cheers, -- j
Hoi, The ISO-639 standards have terms of use. Thanks, GerardM
On 10/8/06, GerardM gerard.meijssen@gmail.com wrote:
Hoi, The ISO-639 standards have terms of use.
"Terms of use" generally indicates some kind of binding agreement. It would be better to say that the practice is against standards or nonconformant or something.
Simetrical wrote:
On 10/8/06, GerardM gerard.meijssen@gmail.com wrote:
Hoi, The ISO-639 standards have terms of use.
"Terms of use" generally indicates some kind of binding agreement. It would be better to say that the practice is against standards or nonconformant or something.
Hoi, That would be the case if they were not called "terms of use". Thanks, GerardM
2006/10/8, GerardM gerard.meijssen@gmail.com:
Hoi, The ISO-639 standards have terms of use.
If that is the case, then isn't using it against our policies of only using open source?
Andre Engels wrote:
2006/10/8, GerardM gerard.meijssen@gmail.com:
Hoi, The ISO-639 standards have terms of use.
If that is the case, then isn't using it against our policies of only using open source?
Hoi, This is like saying: I do not want to use the kilo or the Ampere because you are not allowed to interpret them in a way that you see fit. Open Source is about SOFTWARE. But ISO-639 is about STANDARDS; a standard is of no use if it is not used as a standard, i.e. as a given. Thanks, GerardM
2006/10/9, Gerard Meijssen gerard.meijssen@gmail.com:
If that is the case, then isn't using it against our policies of only using open source?
Hoi, This is like saying: I do not want to use the kilo or the Ampere because you are not allowed to interpret them in a way that you see fit. Open Source is about SOFTWARE. But ISO-639 is about STANDARDS; a standard is of no use if it is not used as a standard, i.e. as a given.
We use free standards as well as free software. Standards that may not be used in some way are not free.
Hoi, The ISO standards are a bit more than just language standards. You are absolutely free to use them as is. You are not allowed to present codes that you make up and that can be confused with what is in the standard. For ISO-639 this is absolutely reasonable, given that there are over 7000 codes to start off with.
So if you refuse to use standards that you cannot change, start telling people in the city how many bunder your father has on his farm. You will find that many have no idea how large an area that is.
Free is a relative thing. For some things it makes no sense at all to have them Free as in "you can change it in any which way". The standards organisations have several ways in which they can be influenced. You can discuss language issues with them, and that is what the language subcommittee of the WMF hopes to do. In the meantime, you have to realise that the ISO standards are an integral part of how the world wide web works. When you decide that you will not use these standards, I am afraid your computer will not be able to communicate with the Internet. Standards are standards for a reason.
Thanks, GerardM
2006/10/10, GerardM gerard.meijssen@gmail.com:
The ISO standards are a bit more than just language standards. You are absolutely free to use them as is. You are not allowed to present codes that you make up and that can be confused with what is in the standard. For ISO-639 this is absolutely reasonable, given that there are over 7000 codes to start off with.
Which in my opinion would be a reason to go to our own system rather than theirs.
So if you refuse to use standards that you cannot change, start telling people in the city how many bunder your father has on his farm. You will find that many have no idea how large an area that is.
So what? If I and my friends agree on what a bunder is, we can use it. To the outside world I have to use something else. It's not like Wikipedia is trying to tell others what language codes to use...
Free is a relative thing. For some things it makes no sense at all to have them Free as in "you can change it in any which way". The standards organisations have several ways in which they can be influenced. You can discuss language issues with them, and that is what the language subcommittee of the WMF hopes to do. In the meantime, you have to realise that the ISO standards are an integral part of how the world wide web works. When you decide that you will not use these standards, I am afraid your computer will not be able to communicate with the Internet. Standards are standards for a reason.
Why is that? Giving our languages another URL is not going to make them suddenly unfindable to half the world.
Hoi, Actually, the database says what language is used, and the values used there are indeed the ISO codes. Google and other organisations would have a hard time knowing what language is meant if we were to have our "own" codes. Indeed, we would not be as popular if people did not find us through the search engines.
Another thing: I know that Brion will not allow us to deviate from the standard more than we already do. Thanks, GerardM
Brion Vibber <brion@...> writes:
Minh Nguyen wrote:
Gerard, please note that Wikipedia hasn't standardized on ISO 639-2 like Wiktionary has. Otherwise, you'd be using "eng.wikipedia.org" instead of "en.wikipedia.org".
With rare exceptions, we follow RFC 3066 for language codes; this is the standard used on the web for the Content-Language HTTP header, HTML 'lang' attribute, etc.
This uses, roughly, in priority order:
- 2-letter ISO 639-1 code
- 3-letter ISO 639-2 code
- One of the specially registered 'i-*' codes
- Non-standard 'x-*' codes
ISO 639-3 codes will most likely end up in there sooner or later.
Where no standard codes are available, we've sometimes ended up using something else; in this case we *should* be using the ISO 639-3 3-letter code. If there are exceptions that are definitely wrong, we probably want to fix that.
Maybe someday; I still remember going to "en2.wikipedia.org" way back when, so the URLs certainly aren't permanent.
That was crude load-balancing before we had something better set up. :) It's redirected if you've got really old URLs from that period.
Hi all,
I think RFC 3066 is deprecated and has been replaced by RFC 4646 and RFC 4647. Some of the existing language codes used in the software will change under RFC 4646. For example, 'zh-yue' becomes 'yue', and tags like 'zh-hans' and 'zh-hant' become valid. If a language code changes, not only does the domain name change; the names used in the PHP files (MessagesXx.php, names.php, and several PHP files used in the extensions) should also be changed.
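To illustrate what such a rename would touch, here is a rough sketch. The mapping contains only the 'zh-yue' example from this message, and the file-naming rule (first letter capitalised, hyphens turned into underscores) is an assumption about how the MessagesXx.php names are derived, not a confirmed description of the software. Both helper names are hypothetical.

```python
# Only the example given in this message; any further entries would be guesses.
LEGACY_TO_NEW = {"zh-yue": "yue"}


def migrate_tag(tag: str) -> str:
    """Map a legacy language code to its replacement, if one is listed."""
    tag = tag.lower()
    return LEGACY_TO_NEW.get(tag, tag)


def messages_filename(tag: str) -> str:
    """Derive a MessagesXx.php style file name from a language code.

    Assumed convention: capitalise the first letter and replace
    hyphens with underscores.
    """
    name = tag.replace("-", "_")
    return "Messages" + name[:1].upper() + name[1:] + ".php"


old = "zh-yue"
new = migrate_tag(old)
# Both the subdomain and the message file name would change together.
print(old + ".wikipedia.org", "->", new + ".wikipedia.org")
print(messages_filename(old), "->", messages_filename(new))
```

Tags that are not listed in the mapping, such as 'zh-hans', pass through unchanged.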
Shinjiman