Has anyone ever considered completely migrating WMF projects to three-letter language codes? Currently, two-letter ISO 639-1 codes are used whenever possible, and three-letter ISO 639-2 or ISO 639-3 codes are used when a two-letter code is not available.
Three-letter codes that currently have Wikipedias include Sicilian (scn), Kashubian (csb), Nahuatl (nah), Udmurt (udm) and Mari (mhr).
Using three-letter codes for all languages seems to me like a more egalitarian approach.
Two-letter URLs must, of course, be kept as redirects.
Can anyone think of any problems with this?
Hoi, in ISO 639-6 there will be two-, three- and four-character codes for linguistic entities. English, for instance, will be known by its two-character code en and not eng.
Also, the RFC on language tags (BCP 47) prefers two-character codes to three-character ones where both exist.
The point here is that by conforming to best practice, we make it easy for search engines to correctly identify which language is used. Consequently, it has nothing to do with egalitarianism; it is just not how things are done when you use these codes.
There are also technical reasons to be careful about the use of codes. Some codes refer to macrolanguages, and these are not eligible for new projects.
Thanks, GerardM
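A minimal Python sketch of the two rules described above: prefer the two-letter code where one exists, and treat macrolanguage codes as ineligible for new projects. The LanguageEntry type, the function names and the three sample rows are assumptions made for illustration; a real check would read the ISO 639-3 tables rather than a hand-written list.

```python
from typing import NamedTuple, Optional


class LanguageEntry(NamedTuple):
    """One row of a hypothetical, hand-written code table."""
    alpha3: str                # three-letter code
    alpha2: Optional[str]      # two-letter ISO 639-1 code, if one exists
    is_macrolanguage: bool     # True when the code covers a macrolanguage


# Illustrative sample only, not a complete or authoritative registry.
SAMPLE = [
    LanguageEntry("eng", "en", False),   # English
    LanguageEntry("scn", None, False),   # Sicilian has no two-letter code
    LanguageEntry("ara", "ar", True),    # Arabic is a macrolanguage
]


def preferred_code(entry: LanguageEntry) -> str:
    """Return the two-letter code when available, else the three-letter one."""
    return entry.alpha2 if entry.alpha2 else entry.alpha3


def eligible_for_new_project(entry: LanguageEntry) -> bool:
    """Macrolanguage codes are not eligible for new projects."""
    return not entry.is_macrolanguage


if __name__ == "__main__":
    for entry in SAMPLE:
        print(preferred_code(entry), eligible_for_new_project(entry))
```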
On 30 June 2010 10:30, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
-- אָמִיר אֱלִישָׁע אַהֲרוֹנִי Amir Elisha Aharoni
"We're living in pieces, I want to live in peace." - T. Moore
foundation-l mailing list foundation-l@lists.wikimedia.org Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
Amir,
I think this is a good idea. For the sake of consistency, we should choose a single standard to follow rather than a hodge-podge of newer standards, older (although still valid) standards, and ad hoc codes we made up on the spot (als, nrm) and custom codes (bat-smg, roa-tara, roa-rup, fiu-vro, map-bms, be-x-old). It also seems potentially confusing to me that we have codes that overlap, for example na.wp and nap.wp, ro.wp and roa-rup.wp, etc.
-m.
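To make the mix of code styles and the overlaps concrete, here is a rough Python sketch that groups the codes mentioned in this thread by shape and lists pairs where one code is a prefix of another. The function names are made up for this example, and the check looks only at the shape of a code: it cannot tell that a three-letter code such as als or nrm was chosen ad hoc rather than taken from a standard.

```python
import re
from collections import defaultdict

# Codes mentioned in this thread only, not the full list of projects.
CODES = [
    "scn", "csb", "nah", "udm", "mhr", "als", "nrm",
    "bat-smg", "roa-tara", "roa-rup", "fiu-vro", "map-bms", "be-x-old",
    "na", "nap", "ro",
]


def shape(code: str) -> str:
    """Classify a code purely by its form."""
    if re.fullmatch(r"[a-z]{2}", code):
        return "two-letter"
    if re.fullmatch(r"[a-z]{3}", code):
        return "three-letter"
    return "custom"


def group_by_shape(codes):
    """Return a dict mapping each shape to the codes that have it."""
    groups = defaultdict(list)
    for code in codes:
        groups[shape(code)].append(code)
    return dict(groups)


def prefix_overlaps(codes):
    """Return sorted (short, long) pairs where long starts with short."""
    return sorted(
        (a, b) for a in codes for b in codes if a != b and b.startswith(a)
    )


if __name__ == "__main__":
    print(group_by_shape(CODES))
    print(prefix_overlaps(CODES))
```

Run on the list above, the overlap check flags na against nap and ro against roa-rup, the two pairs named in the message, along with any other prefix pairs in the list.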
On Wed, Jun 30, 2010 at 5:10 AM, Mark Williamson node.ue@gmail.com wrote:
Amir,
I think this is a good idea. For the sake of consistency, we should choose a single standard to follow rather than a hodge-podge of newer standards, older (although still valid) standards, and ad hoc codes we made up on the spot (als, nrm) and custom codes (bat-smg, roa-tara, roa-rup, fiu-vro, map-bms, be-x-old). It also seems potentially confusing to me that we have codes that overlap, for example na.wp and nap.wp, ro.wp and roa-rup.wp, etc.
-m.
Aside from simplifying the process of selecting new language codes, what value does consistency have in this situation?
Nathan
On Wed, Jun 30, 2010 at 10:30 AM, Amir E. Aharoni amir.aharoni@mail.huji.ac.il wrote:
Has anyone ever considered completely migrating WMF projects to three-letter language codes? Currently, two-letter ISO 639-1 codes are used whenever possible, and three-letter ISO 639-2 or ISO 639-3 codes are used when a two-letter code is not available.
Three-letter codes that currently have Wikipedias include Sicilian (scn), Kashubian (csb), Nahuatl (nah), Udmurt (udm) and Mari (mhr).
Using three-letter codes for all languages seems to me like a more egalitarian approach.
Two-letter URLs must, of course, be kept as redirects.
Can anyone think of any problems with this?
I agree with you, and I have thought about this issue a lot. Apart from exceptional cases, which should be handled with BCP 47 and ISO 639-5 codes, ISO 639-3 codes should be enough for everything that Wikimedia projects need. If we were starting to standardize Wikimedia language codes now, that is how it would be done.
However, after 10 years we already have a tradition. It is not reasonable to break links that have existed for two-thirds of the time the Internet has existed. The most "egalitarian approach" would be redirecting "en" to "eng" and so on. But then it would look like imposing the rule for the sake of imposing the rule.
That doesn't mean we shouldn't do it, but it does mean we should leave it for the future. Wikimedia should work on cultural equality, but it has to do that in cooperation with other organizations.
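For illustration, redirecting "en" to "eng" without breaking existing links could be sketched in Python as below. The mapping table, the function name and the three-letter hostnames are hypothetical (no such hostnames are claimed to exist); the point is only that the path and query of an old link can be carried over unchanged.

```python
from typing import Optional
from urllib.parse import urlsplit, urlunsplit

# Illustrative subset of a two-letter to three-letter mapping.
TWO_TO_THREE = {"en": "eng", "ro": "ron", "na": "nau"}


def redirect_target(url: str) -> Optional[str]:
    """Return the three-letter-host URL for a two-letter-host URL.

    Returns None when the first host label is not in the mapping,
    so URLs that already use three-letter codes are left alone.
    """
    parts = urlsplit(url)
    host = parts.hostname or ""
    label, _, rest = host.partition(".")
    three = TWO_TO_THREE.get(label)
    if three is None or not rest:
        return None
    # Keep scheme, path, query and fragment; swap only the first host label.
    return urlunsplit(
        (parts.scheme, f"{three}.{rest}", parts.path, parts.query, parts.fragment)
    )


if __name__ == "__main__":
    print(redirect_target("https://en.wikipedia.org/wiki/Sicily"))
    print(redirect_target("https://scn.wikipedia.org/wiki/Sicilia"))
```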
wikimedia-l@lists.wikimedia.org