Hi,
A question for the respected language code experts in the audience.
Apparently, the Simple English Wikipedia uses "en" as its language code in the HTML lang attribute, etc.
I never noticed it until recently, when it started causing various bugs with the ContentTranslation extension of which I am a developer. I somehow assumed that it uses something like "en-simple" without ever checking it, and that assumption was wrong - it's just "en".
I believe that the code should be different from what is used by the English Wikipedia, like it is with other wikis in language variants, such as be-tarask.
Do you have any suggestions about what code it should be? en-simple? en-x-simple? Something else? Should I register anything new with any standards organization? (If I recall correctly, this was done for be-tarask?) Can I reuse any existing code that would be appropriate? Or is it a bad idea in general, and it should just stay "en"?
Thanks!
(PS: If you're curious about what the issues are, see https://phabricator.wikimedia.org/T110190 )
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore
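For reference, the check described in the message above (reading the language code a page declares in its HTML lang attribute) can be sketched in a few lines of Python. This is only an illustration using the standard library's html.parser; the class name LangAttributeFinder, the function declared_language and the sample markup are made up for this example and are not taken from MediaWiki or ContentTranslation.

```python
from html.parser import HTMLParser


class LangAttributeFinder(HTMLParser):
    """Record the lang attribute of the first <html> start tag seen."""

    def __init__(self):
        super().__init__()
        self.lang = None

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names before calling this.
        if tag == "html" and self.lang is None:
            self.lang = dict(attrs).get("lang")


def declared_language(html_text):
    """Return the lang value declared on <html>, or None if there is none."""
    finder = LangAttributeFinder()
    finder.feed(html_text)
    return finder.lang


# Hypothetical sample markup, not copied from any real wiki page.
sample = '<html lang="en" dir="ltr"><head><title>Example</title></head></html>'
print(declared_language(sample))
```

Running the same function against a saved copy of a real page would show which code that page actually declares.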
Hoi,
Linguistically it is the same language. There is no reason why it needs to be a different code. The standard allows for a code that is different. The code en-simple is not in conformance with the standard.
Thanks,
GerardM
Langcom mailing list Langcom@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/langcom
On 25 Aug 2015, at 18:34, Gerard Meijssen gerard.meijssen@gmail.com wrote:
Linguistically it is the same language. There is no reason why it needs to be a different code. The standard allows for a code that is different. The code en-simple is not in conformance with the standard.
Please explain what exactly you mean. “The standard (which?) allows for a code that is different (from what?)”. How is “en-simple” not in conformance with “the standard” (which?)?
Michael Everson * http://www.evertype.com/
Indeed, that's a problem!
Imho, we'll have to deal with the status of Simple English in general. Any chance of successfully applying for an ISO code for it? Otherwise, we'll run into more problems like this one https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikinews_Simple_English [1] with new project applications (for example, see Jon Harald's reply under "arguments in favour"). While Wikipedia https://simple.wikipedia.org/wiki/Basic_English [2] defines Basic English (on which Simple English is based) as a constructed language in its own right, that definition doesn't seem to have been widely accepted.
If push comes to shove, I'd vote for en-simple. I had actually assumed that this was the code already in use (based on the URL simple.wikipedia.org).
Fwiw, Oliver
[1] https://meta.wikimedia.org/wiki/Requests_for_new_languages/Wikinews_Simple_E...
[2] https://simple.wikipedia.org/wiki/Basic_English
Dear all,
The odds of Simple English obtaining an ISO 639-3 code are zilch, because these codes are given to languages and Simple English is not a language, at least not a language separate from English. It's merely a way of speaking and writing it, just like "difficult English", children's English, broken English or even Pig Latin. It doesn't have any rules of its own, no separate grammar and no separate word stock; all it does is say: "Try to write in short sentences and avoid difficult words". You can do the same thing with any other language as well, which might equally well result in Simple German, Simple Rhaeto-Romance or Simple Inuktitut.
The only small chance would be applying for an ISO code for Ogden's Basic English, which is generally treated as a constructed language. But that would be cheating, because the Simple English Wikipedia does not limit itself to Ogden's word lists, even though it endorses them. Besides, the ISO registration authority is pretty tough nowadays when it comes to constructed languages, and I don't think they would accept Basic English.
Best regards, Jan van Steenbergen
I wasn't talking about a new code, just a subtag.
The question is which one, and whether we should register it or just make something up and use it.
In terms of subtags, the IETF language tag standard (BCP 47) has an -x- private-use extension, which makes it possible to create something like be-x-old or en-x-simple. On the other hand, the standard also allows variant subtags; for example, Early Modern English can be indicated as en-emodeng. However, such variant subtags must be registered in the IANA Language Subtag Registry.
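As a rough illustration of the distinction above, here is a minimal Python sketch that labels the subtags of a tag by shape alone. The function name describe_subtags and the label strings are made up for this example. The shape rules are meant to follow the BCP 47 grammar: a variant is 5 to 8 alphanumerics, or 4 characters starting with a digit, and everything after the x singleton is private use. The sketch does not consult the IANA registry, so a variant-shaped subtag such as "simple" is only well-formed here, not necessarily valid, and script and region subtags are simply labelled "other".

```python
import re

# Shape of a variant subtag per the BCP 47 (RFC 5646) grammar. This is a
# shape check only; it does not look anything up in the IANA registry.
VARIANT_SHAPE = re.compile(r"[A-Za-z0-9]{5,8}|[0-9][A-Za-z0-9]{3}")


def describe_subtags(tag):
    """Label each subtag of a simple 'language[-variant][-x-private]' tag."""
    subtags = tag.split("-")
    labels = [(subtags[0], "language")]
    in_private_use = False
    for subtag in subtags[1:]:
        if in_private_use:
            # Everything after the 'x' singleton is private use.
            labels.append((subtag, "private use"))
        elif subtag.lower() == "x":
            in_private_use = True
            labels.append((subtag, "private-use singleton"))
        elif VARIANT_SHAPE.fullmatch(subtag):
            labels.append((subtag, "variant-shaped (needs IANA registration)"))
        else:
            # Script, region and extension subtags are out of scope here.
            labels.append((subtag, "other"))
    return labels


for example in ("en-x-simple", "en-simple", "en-emodeng", "be-tarask"):
    print(example, describe_subtags(example))
```

Under these assumptions, en-x-simple can be used without asking anyone, while en-simple has the right shape for a variant but would only become valid once "simple" is registered.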
You might register a subtag for it. It’s not a different language from English, so you’d never get a 639 code for it.
Michael Everson * http://www.evertype.com/
It looks as though I will be approving “simple”, so you should look into the logistics of actually transitioning from simple.wikipedia.org to en-simple.wikipedia.org.
Michael Everson * http://www.evertype.com/
Lovely, thanks a lot!
-- Amir Elisha Aharoni · אָמִיר אֱלִישָׁע אַהֲרוֹנִי http://aharoni.wordpress.com “We're living in pieces, I want to live in peace.” – T. Moore