There have been quite a few requests for Wikipedias in new languages in recent months.
All of them have been examined carefully and discussed broadly by the community.
Seven of them have qualified: they are considered promising and useful projects, and a very clear majority of users is convinced that they will be a beneficial extension of the Wikipedia family.
However, before those babies can grow and prosper, we need a specialist to assist their births.
In other words: I'd like to ask you to please create the wikis specified below.
* Nedersaksisch: nds-nl.wikipedia.org [Netherlands Low Saxon]
* Romani: rmy.wikipedia.org [Vlax Romany]
* Líguru: lij.wikipedia.org [Ligurian]
* Žemaitėška: bat-smg.wikipedia.org [Samogitian]
* Basa Banyumasan: map-bms.wikipedia.org [Banyumasan]
* Ripoarisch: grp.wikipedia.org [Ripuarian]
* Pennsylfaanisch-Deitsch: pdc.wikipedia.org [Pennsylvania German]
Further details can be found at http://meta.wikimedia.org/wiki/Approved_requests_for_new_languages
Thank you very much for your time and attention, also on behalf of our new fellow Wikipedians!
Arbeo
I would like to second this request.
Of these, Banyumasan and Samogitian have been waiting a particularly long time.
Mark
-- "Take away their language, destroy their souls." -- Joseph Stalin
Mark Williamson <node.ue@...> writes:
I would like to second this request.
Of these, Banyumasan and Samogitian have been waiting a particularly long time.
Mark
I also fully support this request. Most of these approved wikipedias have steadily growing test-wikis, and some are starting to look like successful wikipedias already. I hope they all can be created in the near future.
Hyunsung
Uhm, [offtopic]
Such requests may demotivate ;-) E.g. the Samogitian 'language': the request says it has '0.5 million speakers'. Have you ever heard of a language in Europe spoken by 0.5 million people that has no media, no literature, and is not used at all in schools (even in its core region), etc.? :) Currently it's a dialect, which mostly shows up as occasionally odd pronunciations of words ;-)
If there are three guys who can write it, they won't be creating a successful Wikipedia. But I guess Wikipedia fits the purpose of an extinct-language museum?
Cheers, Domas
Yes, I have heard of such languages.
Mark
-- "Take away their language, destroy their souls." -- Joseph Stalin
Note: Some of these proposals propose using the three-letter SIL codes for the projects in question. This is highly inadvisable, because an ISO code equal to that code might be issued in the future.
Ævar, as far as I know, they're equal to the ISO/DIS 639-3 codes. Most ISO/DIS 639-3 codes are taken directly from SIL, unless there is a conflict with existing ISO standards.
While ISO/DIS 639-3 is still provisional, in all reasonable likelihood there will be no major changes to existing codes.
Mark
Hoi, I was at a conference in Berlin this week, where I had the honour of speaking with some of the authors of the ISO 639 codes. The work on ISO 639-3 has finished. They are now working on three new flavours of the code: ISO 639-4, ISO 639-5 and ISO 639-6. The last will be used to indicate dialects. There is nothing available to the public at this time. Thanks, GerardM
On 12/15/05, Mark Williamson node.ue@gmail.com wrote:
While ISO/DIS 639-3 is still provisional, in all reasonable likelihood there will be no major changes to existing codes.
That still doesn't avoid the problem of:
1. us allocating a site under a SIL code or some draft ISO spec;
2. ISO allocating a code that equals the SIL code for a different language, or ISO changing its mind in the final revision of its standard;
3. us having to deal with the mess.
Which is why I think it would be a good idea to put languages that don't have *official* ISO codes, or that only have SIL codes, under some name that definitely won't conflict with future ISO allocations.
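A minimal sketch of the kind of scheme being suggested here (the function, the data, and the prefix convention are all hypothetical illustrations, not anything MediaWiki actually implements): use a code directly only when it is a stable ISO 639-1/-2 code, and otherwise fall back to a family-prefixed name of the sort already used in the request (bat-smg, map-bms), which cannot collide with a future ISO allocation.

```python
# Hypothetical sketch only: pick a wiki subdomain that cannot collide with a
# future ISO 639 allocation.  The data below is illustrative, not a real list.

STABLE_ISO_CODES = {"nds"}                        # codes already in ISO 639-1/-2 (illustrative)
FAMILY_PREFIXES = {"smg": "bat", "bms": "map"}    # draft/SIL-only codes get a family prefix

def pick_subdomain(code: str) -> str:
    """Return a subdomain for the given language code."""
    code = code.lower()
    if code in STABLE_ISO_CODES:
        return code                               # e.g. "nds" -> nds.wikipedia.org
    if code in FAMILY_PREFIXES:
        return f"{FAMILY_PREFIXES[code]}-{code}"  # e.g. "smg" -> bat-smg.wikipedia.org
    return f"x-{code}"                            # last resort: an explicitly private-use prefix

for c in ("nds", "smg", "bms"):
    print(c, "->", pick_subdomain(c))
```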
Hoi, The ISO 639-3 codes are more or less fixed. They will not change. In ISO 639-4 there will be a cleanup. The SIL codes are now superseded by the latest version of the ISO 639-3 codes. I will make sure that we get the latest version of the ISO 639 codes. The SIL codes are all capitalised; the ISO 639 codes are not. Therefore capitalisation is a sure method of preventing conflicts. If our software does not take this capitalisation into account, it is our code that is at fault.
Thanks, GerardM
Domas Mituzas wrote:
Therefore capitalisation is a sure method of preventing conflicts.
If our software does not take this capitalisation into account, it is our code that is at fault.
Was this... a... joke? :)
Since when are domain names case-sensitive?
Domas
Hoi, It is not a joke. When we use SIL codes and intermingle them with ISO 639 codes, it is our responsibility to ensure the right distinction. You cannot mix two standards and blindly expect the result to be correct. It is our software that implements the standards, and consequently, when yet another standard means we cannot make the distinction implicitly, we have to make it explicitly. Thanks, GerardM
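To make the "explicitly rather than implicitly" point concrete: since the distinction cannot live in the domain name itself, it would have to live in an explicit lookup table in the software. A minimal sketch, with purely illustrative entries rather than an authoritative mapping:

```python
# Hypothetical sketch: resolve a language code explicitly instead of relying on
# capitalisation.  Legacy SIL/Ethnologue codes were uppercase, ISO 639-3 codes
# are lowercase; the table entries below are illustrative only.

LEGACY_SIL_TO_ISO = {
    "PDC": "pdc",   # Pennsylvania German
    "LIJ": "lij",   # Ligurian
}

def resolve_code(code: str) -> str:
    """Map a possibly-legacy code to the lowercase ISO 639-3 form we store."""
    if code.isupper():                                # looks like a legacy SIL code
        return LEGACY_SIL_TO_ISO.get(code, code.lower())
    return code.lower()

assert resolve_code("PDC") == resolve_code("pdc") == "pdc"
```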
"Gerard Meijssen" gerard.meijssen@gmail.com wrote in message news:43A41BF4.2090608@gmail.com...
Hoi, It is not a joke. When we use SIL codes and intermingle them with ISO 639 codes, it is our responsibility to ensure the right distinction. You cannot mix two standards and blindly expect the result to be correct. It is our software that implements the standards, and consequently, when yet another standard means we cannot make the distinction implicitly, we have to make it explicitly.
If someone else has already pointed this out, then sorry for duplicating the effort, but I'm afraid the "joke" is on you.
You are suggesting that we distinguish between "SIL" and "ISO" codes for new languages by using UPPERCASE for the former and lowercase for the latter.
These codes will be used to specify the subdomains for the Wikipedias concerned (e.g. isocode.wikipedia.org, SILCODE.wikipedia.org).
Brace yourself, here comes the "joke": domains are CaSe-InSeNsItIvE by design, always have been, likely always will be.
Therefore EN.WIKIPEDIA.ORG == en.wikipedia.org == EN.wikipedia.ORG == En.WiKiPeDiA.oRg, and so on for any combination you could ever make up.
This is a fundamental part of the Internet and has been since its inception.
Nothing we could ever do within the Mediawiki code could ever do anything about it, explicitly or implicitly.
HTH HAND
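For anyone who wants to check this directly, a tiny sketch (it needs network access, and nothing in it is Wikimedia-specific):

```python
# Show that hostname case is irrelevant to DNS: both spellings resolve to the
# same addresses, so upper/lowercase cannot carry a distinction between two
# code standards in a subdomain.
import socket

def addresses(host):
    """Return the set of IP addresses a hostname resolves to."""
    return {info[4][0] for info in socket.getaddrinfo(host, 80)}

assert addresses("EN.WIKIPEDIA.ORG") == addresses("en.wikipedia.org")
print("DNS treats both spellings as the same name")
```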
Phil Boswell wrote:
"Gerard Meijssen" gerard.meijssen@gmail.com wrote in message news:43A41BF4.2090608@gmail.com...
Domas Mituzas wrote:
Therefore capitalisation is a sure method of preventing conflicts.
If our software does not take this capitalisation into account, it is our code that is at fault.
Was this... a... joke? :)
Since when domain names are case sensitive?
Domas
Hoi, It is not a joke. When we use SIL codes and intermingle them with ISO-639 codes, it is our responsibility that we ensure the right distinction. You cannot mix two standards and expect them to be correct blindly. It is our software that implements standards and consequently when yet another standard makes that we cannot do it implicitly we have to do it explicitly.
If someone else has already pointed this out, then sorry for duplicating the effort, but I'm afraid the "joke" is on you.
You are suggesting that we distinguish between "SIL" and "ISO" codes for new languages by using lowercase for the former and UPPERCASE for the latter.
These codes will be used to specify the sub-domains for the wikipedia concerned (e.g. ISOCODE.wikipedia.org, silcode.wikipedia.org).
Brace yourself, here comes the "joke": domains are CaSe-InSeNsItIvE by design, always have been, likely always will be.
Therefore EN.WIKIPEDIA.ORG == en.wikipedia.org ==EN.wikipedia.ORG == En.WiKiPeDiA.oRg and so on for any combination you could ever make up.
This a fundamental part of the Internet and has been since inception.
Nothing we could ever do within the Mediawiki code could ever do anything about it, explicitly or implicitly.
HTH HAND
Hoi, Actually you misunderstand me. When you mix standards and the difference between those standards disappears because of constraints that YOU are under, it means that what you do is not up to standard. Consequently it cannot be said that we implement the ISO 639 codes. When we adopt the ISO 639-3 codes, we have the SIL information included. It is customary to keep the two-character ISO 639-1 codes as well. This is, in my opinion, the only way out of this mess. NB: we have been using these codes for over a year already in Wiktionary.
Yes, I know: URLs are case-insensitive. We can implement ISO 639-3 and we should do so. NB: there are even standard formats to indicate that a certain script is used, as there are ISO codes for scripts too (ISO 15924).
So we either adopt a standard and do it right, or we should not say that we adopt a standard. If you want an official statement as to the stability of ISO 639-3, I think I can get you that.
Thanks, GerardM
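As an illustration of what combining the two standards looks like in practice, here is a hypothetical helper following the usual language-tag convention of a lowercase ISO 639 subtag plus a title-case ISO 15924 script subtag (Latn, Cyrl, ...); it is a sketch, not existing MediaWiki behaviour.

```python
# Hypothetical helper: build a language tag from an ISO 639 code and an
# optional ISO 15924 script code, e.g. "rmy" + "cyrl" -> "rmy-Cyrl".

def language_tag(language, script=""):
    tag = language.lower()                 # ISO 639 codes are conventionally lowercase
    if script:
        tag += "-" + script.capitalize()   # ISO 15924 codes are title case: Latn, Cyrl, ...
    return tag

print(language_tag("nds"))           # nds
print(language_tag("rmy", "cyrl"))   # rmy-Cyrl
```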
Gerard Meijssen wrote:
So we either adopt a standard and do it right or we should not say that we adopt a standard. If you want an official statement as to the stability of the ISO-639-3, I think that I can get you that.
One also needs to remember that ISO 639-3 still reserves a block of codes beginning with the letter "q" that are user-defined. These can be used for languages that are not included in the standard. The languages where that would be needed are not going to be the most popular or best known, and are unlikely to become big wikis. If at some future time they are given a standard code, conversion should not be a huge operation.
Ec
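A quick sketch of the reserved block mentioned above (ISO 639-3 sets aside qaa-qtz for local use; the helper itself is made up for illustration):

```python
# Check whether a three-letter code falls inside the ISO 639-3 private-use
# range qaa-qtz, which can be assigned locally without risk of the code later
# being given to a real language.

def is_private_use(code):
    code = code.lower()
    return (
        len(code) == 3
        and code.isalpha()
        and code[0] == "q"
        and "a" <= code[1] <= "t"
    )

assert is_private_use("qbn")       # inside qaa-qtz, safe for local assignment
assert not is_private_use("que")   # Quechua already holds this real ISO 639 code
```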
Ray Saintonge wrote:
One also needs to remember that ISO 639-3 still reserves a block of codes beginning with the letter "q" that are user-defined. These can be used for languages that are not included in the standard. The languages where that would be needed are not going to be the most popular or best known, and are unlikely to become big wikis. If at some future time they are given a standard code, conversion should not be a huge operation.
Ec
Hoi, Many of the current Wiktionaries make use of the ISO 639-3 codes. Here the size of a prospective wiki is not relevant; we use the code to indicate that a word is in a given language. I have extended the data design of Ultimate Wiktionary to include room for a translation framework. The idea is that it will help us manage the translation of the Mediawiki messages. Given that UW will be a Mediawiki application, and that we want the localisation for the languages that make use of it (basically all of them), we need a way to manage the translations in a structured way.
Please have a look at http://meta.wikimedia.org/wiki/Ultimate_Wiktionary_data_design#Managing_tran... and give me your questions and comments.
Thanks, GerardM
I don't see the problem here: they all have ISO codes. Who cares about capitalisation or not? ISO codes are not capitalised, as far as I know, and most will certainly never change and will always stay that way.
Correct and official ISO codes:
* nds-nl: Nedersaksisch (personal reply from the ISO code commission)
* rmy: Romani Vlax (official code; Ethnologue)
* pdc: Pennsylvania German (official code; Ethnologue)
* lij: Ligurian (official code; Ethnologue; current proposal "roa-lij" or "lij")
Semi-official codes:
* vls: West-Vlaams (the term Flemish is used as a collective term for Dutch dialects in Flanders, but is mostly associated with West Flemish, so I've heard; "gem-wvl" could be an option as well)
* ksh: Ripuarian (ksh = Kölsch; Ripuarian is not listed on Ethnologue)
Not listed on Ethnologue, so a "dialect code" is used (gem-, bat-, map-, etc.):
* bat-smg: Samogitian (not listed on Ethnologue)
* qbn or map-bms: Banyumasan (not listed on Ethnologue)
Servien
Servien, the issue is the _version_ of ISO. Some people seem to be worried about the stability of ISO 639-3.
Mark
Ævar Arnfjörð Bjarmason <avarab@...> writes:
Note: Some of these proposals propose using the three-letter SIL codes for the projects in question. This is highly inadvisable, because an ISO code equal to that code might be issued in the future.
Hi, Ævar!
The codes used in the current edition of Ethnologue are a subset of ISO 639-3 (other subsets being for extinct and constructed languages, and for macro-languages), and they have already been completely harmonised with ISO 639-2. I'm not sure how many individual code changes they made, except that they changed all their codes from upper to lower case.
And the SIL page on ISO 639-3 claims that it is "official site of the ISO 639-3 Registration Authority and thus is the only one authorized by ISO", so there is no chance of any future code conflicts.
There is a complete list of codes on the site, which also includes the ISO 639-1 and ISO 639-2 codes for comparison.
Hyunsung