A proclamation
I support the continuation of the Klingon Wikipedia, because it seems that a (very rough) consensus has been reached. It's worth noting, too, that Klingon is a "special case" in many ways in the geek culture where we all live, and that my own confusing remarks caused it to be created, and then deleted, causing hurt feelings which are fully my own fault and much regretted.
In short, this is a unique historical situation that ought not to be viewed as creating a precedent. I feel the same way about the sep11 wiki, a project that we likely would not have undertaken or continued to support, except for a set of unique historical facts about how our project has evolved.
I'm not really ready to declare an _exact_ policy for future cases, and it would be inappropriate of me to do so until we have more consensus building, but I think that we can easily recognize the broad outlines of a reasonable policy...
1. We ought to use some external source or sources to determine what we will count as a language for the purposes of creating new projects.
2. Based on those external sources, we ought to have a mechanical rule that generates a default answer. For example, Klingon has a 3 letter ISO 639-2 code, so the default is "accept". "Toki Pona" has no such code, so the default is "reject". (However, I have been convinced by sound argument that the ISO 639 codes are often drawn along political lines rather than real linguistic lines, so a more sophisticated rule is likely needed than in this simple example.)
3. In cases where there is significant community support or opposition to the result given by the "rule" we should use some additional procedure, such as a vote, or better yet a discussion with an eye toward reaching some compromise, followed by a vote.
4. Reasons for overriding the "rule" could range from the rule being wrong to historical precedent. For example, we might decide ultimately to keep 'toki pona' even though in other cases, we would not. While consistency is a value, special cases can be made as a courtesy when a significant community has grown up.
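A minimal sketch of the four points above, assuming the simple ISO 639-2 example from point 2 stands in for the (still undecided) rule. The names `Decision`, `default_decision` and `final_decision` are hypothetical, and the lookup set encodes only the two facts stated above (Klingon has a code, Toki Pona does not); it is not real ISO data.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    ACCEPT = "accept"
    REJECT = "reject"


# Hypothetical stand-in for the external source (point 1).
# Only the example from the message is encoded here.
LANGUAGES_WITH_ISO_639_2_CODE = {"Klingon"}


def default_decision(language: str) -> Decision:
    """Point 2: a mechanical rule that generates a default answer."""
    if language in LANGUAGES_WITH_ISO_639_2_CODE:
        return Decision.ACCEPT
    return Decision.REJECT


def final_decision(language: str,
                   community_outcome: Optional[Decision] = None) -> Decision:
    """Points 3 and 4: a community outcome (discussion, then a vote)
    overrides the default; with no such outcome the default stands."""
    if community_outcome is not None:
        return community_outcome
    return default_decision(language)
```

With this sketch, `final_decision("Toki Pona")` gives the default "reject", while `final_decision("Toki Pona", Decision.ACCEPT)` models a courtesy exception made after community discussion.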
----
I think almost everyone can agree with the outline above, but mostly because it's an abstract procedure that leaves us with no concrete guidance. ;-) I'm good at that. But I freely admit that the hard part is settling on a "rule" for the future.
Some things that I think people can agree on about what the rule should look like:
1. The rule should not tell us to have separate wikipedias for British English and Australian English and American English. (Nor for "African-American Vernacular English", popularly called "ebonics", nor for "Southern American English", my own native dialect.)
2. The rule should provide some means of exclusion for vanity projects and extremely small (and thus unlikely to be successful) groups.
3. The rule should be external to Wikipedia, based on some other official standards. The reason for this is that this is only our default, and the whole purpose of the rule is to give us one less thing to argue about. Let some international body make the decision, and then we follow it unless we do something unusual.
--Jimbo
Jimmy Wales wrote:
- The rule should not tell us to have separate wikipedias for British English and Australian English and American English. (Nor for "African-American Vernacular English", popularly called "ebonics", nor for "Southern American English", my own native dialect.)
- The rule should provide some means of exclusion for vanity projects and extremely small (and thus unlikely to be successful) groups.
- The rule should be external to Wikipedia, based on some other official standards. The reason for this is that this is only our default, and the whole purpose of the rule is to give us one less thing to argue about. Let some international body make the decision, and then we follow it unless we do something unusual.
The Ethnologue, a language catalogue published by SIL International, does all of these things. SIL is a non-profit organisation dedicated to linguistics, language documentation and literacy. Their catalogue makes a division between languages and dialects based on linguistic rather than national concerns. They list 6,800 "main languages", plus dialects and alternate names. This is as opposed to ISO's approximately 490 "languages", many of which even they admit are actually groups of languages.
SIL seems to have little time for constructed languages, listing only three. ISO 639-2, on the other hand, has a policy allowing any language with more than 50 documents to obtain a code. Hence, Klingon is included in ISO's short list, but not in SIL's much longer one.
My proposal is to automatically allow any language considered one of SIL's main languages, and to only seek community approval when it is not listed. I think we should largely ignore the ISO list.
-- Tim Starling
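The proposal above could be sketched roughly as follows. This is only an illustration: `Outcome` and `sil_default` are hypothetical names, and the sample set is a tiny placeholder rather than the actual Ethnologue list.

```python
from enum import Enum


class Outcome(Enum):
    ALLOW = "allow automatically"
    ASK_COMMUNITY = "seek community approval"


def sil_default(language: str, sil_main_languages: set) -> Outcome:
    """Allow any language on the SIL main-language list automatically;
    anything else goes to the community for approval."""
    if language in sil_main_languages:
        return Outcome.ALLOW
    return Outcome.ASK_COMMUNITY


# Placeholder sample, NOT the real catalogue. Per the thread, Klingon
# is absent from SIL's list, so it is deliberately left out here.
SAMPLE_SIL_MAIN_LANGUAGES = {"Esperanto", "Interlingua"}
```

Note that an unlisted language is not rejected under this reading; it only loses the automatic approval.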
My personal experience with software internationalization and computing in minority languages is that SIL can't be trusted, especially where it draws a line. ISO 639 started from what the US Library of Congress was doing in categorizing books (the written form of languages). SIL, on the other hand, set out to find out about very local languages (many of which are not written, or share a written form with a more common language), to help religious evangelists learn about minor communities and translate the Bible into their languages. I believe the written vs. spoken distinction is very important here.
As an example, in the case of the Persian language, whose Wikipedia I helped start, we could clearly and easily unify the encyclopedias for what ISO 639 calls "Persian" and what SIL calls Eastern Farsi, Western Farsi, and Hazaragi. These three are sometimes pronounced so differently that a Tehrani speaker cannot understand a native Kabuli, but they are so similar when written that we can write a single encyclopedic article that is grammatically correct for all three. SIL's Persian group, which can be found at http://www.ethnologue.com/show_family.asp?subid=1000, contains some vague cases, but also the clear case of Tajiki, which can't be unified with the Persian Wikipedia since it's written in Cyrillic, not Arabic script. And guess what? ISO 639 already separates Tajiki and gives it a separate entry and code.
On Mon, 31 May 2004 21:22:34 +1000, Tim Starling ts4294967296@hotmail.com wrote:
Jimmy Wales wrote:
- The rule should not tell us to have separate wikipedias for British English and Australian English and American English. (Nor for "African-American Vernacular English", popularly called "ebonics", nor for "Southern American English", my own native dialect.)
- The rule should provide some means of exclusion for vanity projects and extremely small (and thus unlikely to be successful) groups.
- The rule should be external to Wikipedia, based on some other official standards. The reason for this is that this is only our default, and the whole purpose of the rule is to give us one less thing to argue about. Let some international body make the decision, and then we follow it unless we do something unusual.
The Ethnologue, a language catalogue published by SIL International, does all of these things. SIL is a non-profit organisation dedicated to linguistics, language documentation and literacy. Their catalogue makes a division between languages and dialects based on linguistic rather than national concerns. They list 6,800 "main languages", plus dialects and alternate names. This is as opposed to ISO's approximately 490 "languages", many of which even they admit are actually groups of languages.
SIL seems to have little time for constructed languages, listing only three. ISO 639-2, on the other hand, has a policy allowing any language with more than 50 documents to obtain a code. Hence, Klingon is included in ISO's short list, but not in SIL's much longer one.
My proposal is to automatically allow any language considered one of SIL's main languages, and to only seek community approval when it is not listed. I think we should largely ignore the ISO list.
-- Tim Starling
Wikipedia-l mailing list Wikipedia-l@Wikimedia.org http://mail.wikipedia.org/mailman/listinfo/wikipedia-l
From: "Tim Starling"
SIL seems to have little time for constructed languages, listing only three. ISO 639-2, on the other hand, has a policy allowing any language with more than 50 documents to obtain a code. Hence, Klingon is included in ISO's short list, but not in SIL's much longer one.
My proposal is to automatically allow any language considered one of SIL's main languages, and to only seek community approval when it is not listed. I think we should largely ignore the ISO list.
Hi Tim, The SIL ethnologue list is quite flawed. In this respect the ISO codes are more dependable...
The Ethnologue lists http://www.ethnologue.com/show_family.asp?subid=827 Esperanto, Europanto, and Interlingua. It further mentions that Interlingua is a language of France... http://www.ethnologue.com/show_language.asp?code=INR It also claims that Esperanto is a language of France, and that it has "200 to 2,000 people who speak it as first language". If so it would be a natural and non-artificial language for them wouldn't it, those French native speakers of Esperanto.... Highly irregular!
The list is flawed, and the fact that they include "Europanto" is quite a joke. No kidding: Europanto was a joke language developed by translators within the EU, purely for amusement. To exclude Volapük, which at one time had hundreds of thousands of learners and users and still has a small community of active users, is just wrong if one is going to include "Europanto", which no one really uses as a community except joking translators in the European Union buildings in Brussels... as Ethnologue points out.
The Ethnologue list is definitely flawed and worse as a resource in this respect than the use of ISO codes.
With regards, Jay B.
Jay Bowks wrote:
The Ethnologue lists http://www.ethnologue.com/show_family.asp?subid=827 Esperanto, Europanto, and Interlingua.
[...]
It also claims that Esperanto is a language of France, and that it has "200 to 2,000 people who speak it as first language". If so it would be a natural and non-artificial language for them wouldn't it, those French native speakers of Esperanto.... Highly irregular!
OK, I have no idea why it claims Esperanto to be a language of France, but the information about native speakers is as accurate as is known (though of course not all of them, in fact probably very few of them, live in France).
There being native speakers obviously doesn't change the fact that Esperanto is constructed, not natural.
Timwi
Jay Bowks wrote:
The SIL ethnologue list is quite flawed. In this respect the ISO codes are more dependable...
The Ethnologue lists http://www.ethnologue.com/show_family.asp?subid=827 Esperanto, Europanto, and Interlingua. It further mentions that Interlingua is a language of France... http://www.ethnologue.com/show_language.asp?code=INR It also claims that Esperanto is a language of France, and that it has "200 to 2,000 people who speak it as first language". If so it would be a natural and non-artificial language for them wouldn't it, those French native speakers of Esperanto.... Highly irregular!
The list is flawed, and the fact that they include "Europanto" is quite a joke. No kidding: Europanto was a joke language developed by translators within the EU, purely for amusement. To exclude Volapük, which at one time had hundreds of thousands of learners and users and still has a small community of active users, is just wrong if one is going to include "Europanto", which no one really uses as a community except joking translators in the European Union buildings in Brussels... as Ethnologue points out.
We're not interested in language classification, number of native speakers or country of origin, so I don't really care if there are errors in that respect. We are not suggesting using the Ethnologue as the only source for our articles on languages. Mistakes in the inclusion or non-inclusion of languages are, I'll admit, a more serious issue. However, I did point out that SIL has little time for artificial languages, so mistakes in this section are hardly surprising.
-- Tim Starling
On May 31, 2004, at 6:26 AM, Jimmy Wales wrote:
A proclamation ...3. The rule should be external to Wikipedia, based on some other official standards. The reason for this is that this is only our default, and the whole purpose of the rule is to give us one less thing to argue about. Let some international body make the decision, and then we follow it unless we do something unusual.
Hm. I have no horse in this race, but isn't the sole determinant of a successful wikipedia the number of active contributors?
Regardless of any external language codes (we certainly don't have wikis for all the languages under any of the aforementioned schemes), it all comes down to an issue of who contributes. If a language had *no* official codes or recognition by linguists, but 500 contributors, it would probably be more successful than any officially recognized language that only has one or two contributors.
By using a "contributor" metric, we can prevent the small vanity (and usually dead end) projects, and foster viable projects, regardless of their "official" recognition by any given standards body. Of course, then we'd have to argue about the number of individuals required to start a project, but a contributor metric bypasses the whole "is it a real language" issue entirely.
-Bop
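As a rough sketch, a contributor metric of the kind described above ignores language status entirely and looks only at head count. The function name and the threshold of 10 are hypothetical; as the message says, the actual number would still have to be argued out.

```python
def passes_contributor_metric(active_contributors: int, minimum: int) -> bool:
    """Return True when a proposed wiki has enough active contributors.

    Official codes and recognition by linguists are deliberately not
    inputs: the metric sidesteps the "is it a real language" question.
    """
    if minimum < 1:
        raise ValueError("minimum must be at least 1")
    return active_contributors >= minimum


# Hypothetical threshold, chosen only to make the example concrete.
HYPOTHETICAL_MINIMUM = 10
```

Under that assumed threshold, the unrecognized language with 500 contributors passes, and the officially recognized one with one or two contributors does not.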
--- Ronald Chmara ron@Opus1.COM wrote:
Regardless of any external language codes (we certainly don't have wikis for all the languages under any of the aforementioned schemes), it all comes down to an issue of who contributes. If a language had *no* official codes or recognition by linguists, but 500 contributors, it would probably be more successful than any officially recognized language that only has one or two contributors.
Although if a language *does* have an official language code and linguists that recognize it, then there should never be a question that we should start a Wikipedia in that language soon after somebody volunteers to translate the interface, whether it has very few contributors or not.
Failing that, we should look at a more complicated mix of considerations (and the number of contributors is a good thing to look at).
All IMO, of course.
-- Daniel Mayer (aka mav)
Quoting Jimmy Wales jwales@bomis.com:
A proclamation
I support the continuation of the Klingon Wikipedia, because it seems that a (very rough) consensus has been reached. It's worth noting, too, that Klingon is a "special case" in many ways in the geek culture where we all live, and that my own confusing remarks caused it to be created, and then deleted, causing hurt feelings which are fully my own fault and much regretted.
Ahhah.
In short, this is a unique historical situation that ought not to be viewed as creating a precedent. I feel the same way about the sep11 wiki, a project that we likely would not have undertaken or continued to support, except for a set of unique historical facts about how our project has evolved.
Yeah. Sure; absolutely true.
I'm not really ready to declare an _exact_ policy for future cases, and it would be inappropriate of me to do so until we have more consensus building, but I think that we can easily recognize the broad outlines of a reasonable policy...
Snipping here, because all of it has been designed to be non-controversial, and indeed has been nicely constructed with that view.
...
...
...
I think almost everyone can agree with the outline above, but mostly because it's an abstract procedure that leaves us with no concrete guidance. ;-) I'm good at that. But I freely admit that the hard part is settling on a "rule" for the future.
Some things that I think people can agree on about what the rule should look like:
- The rule should not tell us to have separate wikipedias for British English and Australian English and American English. (Nor for "African-American Vernacular English", popularly called "ebonics", nor for "Southern American English", my own native dialect.)
Actually (and this is just about my only objection), you seem to be making the same mistake here that was made in lumping Toki Pona, Esperanto and Klingon together with a bunch of other languages, except that you are doing it with a view to exclusion rather than inclusion.
It is singularly unhelpful to decide beforehand which languages you want a universal rule to "justify" and which to "condemn", and then to _design_ a "universal" rule that produces the wanted result.
To show that I am not making a purely idle procedural point, consider the grouping of "ebonics" here with the other varieties of speech.
As far as I understand it, Ebonics is motivated by a genuine political desire towards recognition (even if not separatism). It therefore may well have the "legs" to keep it going till eventual _actual_ separation from the English Language occurs. This is supported by the fact that the people at the center of the process have actually been _describing_ actual differences of grammar between this "vernacular" and "English proper".
To take a strong stance against including an "Ebonics" wikipedia, when we really have a poor view of its future evolution, is in my opinion counterproductive.
To try to gerrymander a "universal rule" that would keep it out would (frankly) be ludicrous.
- The rule should provide some means of exclusion for vanity projects and extremely small (and thus unlikely to be successful) groups.
- The rule should be external to Wikipedia, based on some other official standards. The reason for this is that this is only our default, and the whole purpose of the rule is to give us one less thing to argue about. Let some international body make the decision, and then we follow it unless we do something unusual.
--Jimbo