Hi,
Two weeks ago Amir submitted a request to the mailing list asking folks to review the list of language names available in Names.php:
https://gerrit.wikimedia.org/r/gitweb?p=mediawiki/core.git;a=blob;f=language...
The request was followed by a few issues noticed for Czech, Slovak, and some other languages. Relying only on whatever input the community happens to provide cannot guarantee that the rest of the data is correct. Given this, I recommended a minimal implementation of CLDR data. Here's what I wrote to Amir:
I skimmed through the list and haven't seen anything incorrect. I have a question though: since some of this data is available in CLDR, have you ever considered integrating their data and falling back to your own list where CLDR has no entry? The fallback would definitely be necessary in some cases because your list is *way* more extensive than what CLDR currently supports.
Of course, the CLDR spec makes it easy to add new locales. So the ideal would be to have a seed (with minimal information) for the locales which don't exist there but are present in the MW list. As CLDR is peer-reviewed through surveys targeted at in-country scholars and standards body representatives, the quality of the data and metadata is normally very good.
In the past, there was at least one extension I know of that facilitated the use of CLDR data in MW: http://www.mediawiki.org/wiki/Extension:CLDR
Let me know what you think. I'd be happy to help.
I haven't received any feedback from Amir so far, and as I'm not a MW developer, I'm writing here to ask for your opinion on the matter. The bottom line is that I can script something that cross-checks Names.php values against CLDR entries, but I think it'd be better to think about a long-term solution.
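To give an idea of what such a cross-check might look like, here is a minimal Python sketch. It is only an illustration under assumptions: it assumes Names.php entries look like `'cs' => 'Česky',` (one per line, no apostrophes inside a name), and it assumes the CLDR names have already been extracted into a plain dictionary. The function names `parse_php_names` and `cross_check` and the sample values are made up for the example.

```python
import re
import unicodedata

# Assumption: each entry in Names.php looks like  'cs' => 'Česky',
# Names that contain a quote character would need a real PHP parser.
ENTRY_RE = re.compile(r"['\"]([a-z0-9-]+)['\"]\s*=>\s*['\"]([^'\"]+)['\"]")


def parse_php_names(php_source):
    """Extract {language code: name} pairs from PHP array syntax."""
    return dict(ENTRY_RE.findall(php_source))


def normalize(name):
    """Make names comparable: same Unicode form, letter case ignored."""
    return unicodedata.normalize("NFC", name).casefold()


def cross_check(mw_names, cldr_names):
    """Compare MediaWiki names against CLDR names.

    Returns (mismatches, missing). mismatches maps a code to the pair
    (MediaWiki name, CLDR name) when they differ beyond letter case;
    missing lists the codes that have no CLDR entry at all.
    """
    mismatches = {}
    missing = []
    for code, mw_name in sorted(mw_names.items()):
        cldr_name = cldr_names.get(code)
        if cldr_name is None:
            missing.append(code)
        elif normalize(mw_name) != normalize(cldr_name):
            mismatches[code] = (mw_name, cldr_name)
    return mismatches, missing


# Illustrative sample data, typed in by hand for this example.
sample_php = """
    'cs' => 'Česky',
    'sk' => 'Slovenčina',
    'ast' => 'Asturianu',
"""
sample_cldr = {"cs": "čeština", "sk": "slovenčina"}

mismatches, missing = cross_check(parse_php_names(sample_php), sample_cldr)
print("differs from CLDR:", mismatches)
print("not in CLDR:", missing)
```

Comparing case-insensitively keeps capitalization differences out of the report, so only real naming differences and missing locales show up.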
Cheers, Shervin
Hoi, We do use the CLDR data and we REALLY, REALLY want people to assess the data in the CLDR for both correctness and completeness. Many of the languages we support in MediaWiki are not yet supported in the CLDR. We are actively seeking this support. We would also be REALLY happy if any and all languages were supported in the CLDR.
And yes, the CLDR extension is still very much in use. We have regularly overridden the CLDR data because of problems with it. Recently an override was put in place on the Incubator to change Aurocana to Mapundungun. Thanks, Gerard
PS you CAN change CLDR data at this moment ... http://cldr.unicode.org/index/survey-tool
In reply to the message from Shervin Afshar <shervinafshar@gmail.com> of Fri, Apr 13, 2012, 9:22 PM.
Mediawiki-i18n mailing list Mediawiki-i18n@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-i18n
Thanks Gerard
In reply to the message from Gerard Meijssen <gmeijssen@wikimedia.org> of Sat, Apr 14, 2012, 2:51 PM.
Hi Gerard, all
Some time ago I was trying to get the Asturian language included in CLDR [1].
Unfortunately, it took longer than expected to solve a bug with our alphabet [2], and we didn't make it in time for r2.0.
Perhaps now is the right time to try again, but I'm not a developer and it's **really** hard for me to create the requested data. I guess I can get the collation chart from [3], but... anything else? Do you know of a step-by-step guide for absolute beginners on how to provide the required data? I find the Survey Tool (and most of the tools in CLDR) more complicated than it needs to be for non-technical people.
Thanks in advance for any clue you can give me. Best regards.
[1] http://unicode.org/cldr/trac/ticket/2868
[2] https://bugs.freedesktop.org/show_bug.cgi?id=32965
[3] http://developer.mimer.com/charts/asturian.htm
Hoi, I blogged about this .. http://ultimategerardm.blogspot.com/2012/04/supporting-asturian-in-cldr.html Thanks, Gerard
In reply to the message from Xuacu Saturio <xuacusk8@gmail.com> of 15 April 2012, 02:10.
I have requested an account there. How long does it usually take for a request to be processed?
In reply to the message from Gerard Meijssen <gerard.meijssen@gmail.com> of Sun, Apr 15, 2012, 1:08 PM.
Nice!
You describe quite accurately how daunting this process may be for any person trying to include a new language.
Moreover, the first attempts to include a new language are often made by people like me (translators, writers...) who lack deep knowledge of the technical side of Unicode standards and have to learn it the hard way. Oddly enough, things become easier once you pass the entry point and can use the Survey Tool :)
Best regards.
Hi,
Some time ago I made a comparison of MediaWiki and CLDR names, see http://translatewiki.net/wiki/User:SPQRobin/languages . I also find it important to have the right language names, and to be consistent and efficient. Something I've been thinking about for some time is to change names in MediaWiki from always uppercase to lowercase where that's normal (as in CLDR).
Also, I am planning to include all English language names (from ISO 639-3) in MediaWiki. Including native names from CLDR in core is more difficult, but we could perhaps do that in the future and then fall back where possible, as you propose.
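The fallback idea discussed in this thread could be sketched roughly as follows in Python. This is only an illustration, not how MediaWiki or the CLDR extension actually implement it: the three dictionaries and their contents are hypothetical, and the lookup order (local overrides first, then CLDR, then the locally maintained list) is an assumption based on the overrides and fallback mentioned earlier in the thread.

```python
from collections import ChainMap

# Hypothetical data for illustration only.
overrides = {"xx": "Name corrected locally"}         # fixes for known CLDR problems
cldr_names = {"cs": "čeština", "xx": "Name from CLDR"}
local_names = {"cs": "Česky", "ast": "Asturianu"}    # e.g. taken from Names.php

# ChainMap searches its mappings from left to right and returns the
# first hit, so overrides win over CLDR and CLDR wins over the local list.
names = ChainMap(overrides, cldr_names, local_names)


def language_name(code):
    """Return the best available name, or the code itself if none is known."""
    return names.get(code, code)


print(language_name("cs"))   # found in CLDR
print(language_name("ast"))  # not in CLDR, falls back to the local list
print(language_name("xx"))   # the local override takes precedence
print(language_name("zz"))   # unknown everywhere, the code is returned
```

Keeping the layers as separate mappings means the CLDR data can be refreshed without touching the overrides or the local list.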