Here is some bad news and some good news...
The bad news is that I've finally realized why I needed a separate wiki for data: it's because of Ethnologue's restrictive ToS [1]. In other words, I could just say to myself: Welcome back to the wonderful world of licenses!
So, I've created a private wiki with some of the data. Anyone willing to join me in "data analysis" work is welcome; I'll create accounts on that wiki. That said, I urge all relevant people to contact me privately with their preferred username. (To be more precise, this relates to languages, chapters, WMF and its funds.) I also need one or more people willing to code in Python.
The good news is that I've realized I did a good job coding, with a number of relevant categorizations; which leads to more bad news, because I'll need some time to get familiar with my code again.
The data about the number of languages not represented on Wikimedia projects:
* 23 languages with more than 10 million speakers
* 230 languages with more than one million speakers
* 866 languages with more than 100 thousand speakers
* 1831 languages with more than 10 thousand speakers
The largest language with a project in the Incubator has 38 million speakers.
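A minimal sketch of how threshold counts like the ones above could be tallied in Python. The record layout, the function name and the sample rows are all hypothetical placeholders, not Ethnologue data and not my actual code. It assumes the counts are cumulative (a language with 25 million speakers is counted in every bucket), which is how I read the list above.

```python
# Hypothetical records: (language code, number of speakers, has a Wikimedia project).
# These rows are made up for illustration only.
SAMPLE = [
    ("aaa", 25_000_000, False),
    ("bbb", 3_000_000, False),
    ("ccc", 1_500_000, True),   # already represented, so never counted
    ("ddd", 400_000, False),
    ("eee", 50_000, False),
    ("fff", 8_000, False),      # below every threshold
]

THRESHOLDS = [10_000_000, 1_000_000, 100_000, 10_000]


def count_unrepresented(records, thresholds=THRESHOLDS):
    """Map each threshold to the number of unrepresented languages
    with strictly more speakers than that threshold (cumulative)."""
    return {
        t: sum(
            1
            for _code, speakers, has_project in records
            if not has_project and speakers > t
        )
        for t in thresholds
    }


if __name__ == "__main__":
    for threshold, count in count_unrepresented(SAMPLE).items():
        print(f"{count} languages with more than {threshold:,} speakers")
```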
[1] http://www.ethnologue.com/terms-use
On Sat, Apr 26, 2014 at 2:11 PM, Seb35 seb35wikipedia@gmail.com wrote:
Hei,
As a supporter of language diversity, I'm a bit saddened by this thread, because some people think we should not engage in language revitalisation on the grounds that: 1/ it's not explicitly in our scope (and I don't fully agree: "sum of all knowledge" also includes minority cultures expressed in their languages, as shown by Hubert Laska with the "Kneip"), 2/ it's too difficult/expensive "to save most languages".
Although there are obviously great difficulties, I think they shouldn't stop us from supporting or partnering with local language institutions, particularly if there are interested people or volunteers: we are not obliged to select the 3000 most spoken languages and set up partnerships to "save" these 3000 languages, but we can support institutions or volunteers _interested_ in saving some small language on a case-by-case basis (Rapa Nui, Chickasaw, Skolt Sami, Kibushi, whatever) if minimum requirements are met (writing system and ISO 639 code for a website, financial resources for a project), i.e. crowdsourcing language preservation among Wikimedia, volunteers, speakers, and institutions.
When multilingualism in cyberspace is discussed by linguists, Wikipedia is almost always presented as *the* most successful example. As discussed in this thread, perhaps some projects (Wikisource, Wiktionary, Wikidata) are easier to set up in these languages and this could be a first step, but these will only preserve the languages as non-living objects of interest, in contrast to a Wikibooks/Wikipedia/Wikinews/Wikiversity, where speakers could practice the language, invent neologisms and terminology, create corpora for linguists, and show the language to other interested people in the world (I'm sure there are some).
As an example in France, Wikimédia France has quite good relationships with the DGLFLF (General Delegation for the French Language and the Languages of France), and this institution counts 75 languages in France, two-thirds of which are overseas [1]. The DGLFLF contributed resources on some small languages and on multilingualism to Wikibooks [2] and Commons [3].
[1] (fr) http://www.culture.gouv.fr/culture/dglf/lgfrance/lgfrance_presentation.htm
[2] (fr) https://fr.wikibooks.org/wiki/%C3%89tats_g%C3%A9n%C3%A9raux_du_multilinguism...
[3] (fr)(mul) https://commons.wikimedia.org/wiki/Category:%C3%89tats_g%C3%A9n%C3%A9raux_du...
~ Seb35
20.04.2014 05:46:47 (CEST), Milos Rancic wrote:
There are ~6000 languages in the world and around 3000 of them have more than 10,000 speakers.
That approximation has some issues, but they are offset by errors in the opposite direction. Ethnologue is not the best place to find precise data about languages: it may count close varieties of one language as separate languages, but it also fails to count some other languages. Not all of the communities with 10,000 or more speakers have a positive attitude toward their languages, but there are languages with fewer speakers whose communities have a very positive attitude toward their own language.
So, that number is what we could count as the realistic "final" number of language editions of Wikimedia projects. At the moment, we have fewer than 300 language editions.
There is the question: Why should we do that? The answer is clear to me: Because we can.
Yes, there may be more specialized organizations which could do that, but it's not about expertise; it's about ability. Fortunately, we don't need to search for historical examples for comparison; the Internet is good enough.
I still remember an infographic from the time when all of us thought that Flickr was the place for images. It turned out that the biggest repository of images was actually Facebook, which had a hundred times more of them than Twitpic in second place, which, in turn, had a hundred times more images than Flickr.
In other words, the purpose of something and the general perception of its purpose are not enough for doing a good job. Likewise, there are numerous examples of both mismanaged internet projects and mismanaged traditional scientific and educational organizations.
At this point in time Wikimedia has all the necessary capacities -- and even the will to take on that job. So, we should finally start doing that :)
There is also the question: How can we do that? In short, because of Wikipedia.
I announced the Microgrants project of Wikimedia Serbia yesterday. To be honest, we have very low expectations. When I told Filip that I want to have 10 active community members after the project, he said that I am overambitious. Yes, I am.
But ten hours later I got the first response and I was very positively surprised by a lot of things. The most relevant for this story is that a person from a city in Serbia proper is very enthusiastic about Wikipedia and contributing to it (and organizing contributors in the area). I hadn't heard that for years! (Maybe I was just too pessimistic because of my obsession with statistics.)
Keeping in mind her position (she said that she was always complaining about the lack of material on Serbian Wikipedia, although at this point in time it's the encyclopedia in Serbian with the most relevant content) and her enthusiasm, I am completely sure that many speakers of many small languages dream from time to time of having a Wikipedia in their native language.
As in the case of this Serbian from the fifth or sixth largest city in Serbia, I am sure that they just don't know how to do that. So, it's up to us to reach them.
English Wikipedia has had some influence on the contemporary English language ("citation needed", let's say). It has more influence on languages with a smaller number of speakers, like Serbian (the Cyrillic/Latin cultural war in Serbia was over the moment Serbian Wikipedia implemented its transliteration engine; it's no issue now, while it was an issue up to the mid-2000s).
But that's about languages that are well developed in the cultural sense. What about the less developed ones? While I don't have an example of the effects (anyone, please?), counting the amount of written material in some languages, Wikipedia will become (or already has become) the biggest book, sometimes the biggest library, in that language; in some cases Wikipedia will create the majority of texts written in a particular language!
While we think of Wikipedia as a valuable resource for learning about a wide range of topics, the significance of Wikipedia for those peoples would be much higher. If we do the job, there will be many monuments to Wikipedia all over the world, because Wikipedia would preserve many cultures, not just the languages.
There is the question "How?", at the end. There are numerous possible ways and there have also been some attempts to do that, but we have to create a plan for how to do that systematically, well, according to our principles and goals and according to reality.
What we know from our previous experiences:
- The number of editors has declined and, at the moment, without a miracle (or hard work, but I assume most of our movement is used to miracles, not to hard work), the trend will continue. In contrast, the number of readers has increased. Unfortunately, in this case a miracle is not necessary for that trend to end.
- If we count languages with relevant statistics for editors per million, the top ones belong either to highly motivated communities (Hebrew), or to rich countries with a harsh climate, which makes writing on Wikipedia good fun (Estonian, Icelandic, Norwegian, Finnish), or to a community which belongs to both categories (Scots Gaelic). And it's around 100 users per million.
If a community has 100,000 speakers, it would mean that the community would have 10 editors with 5 or more edits per month. In the case of languages with 10,000 speakers, it would mean 1 editor with 5 or more edits per month. That won't work.
I'd say that Scots Gaelic could be a good test (Wikimedia UK help needed!). It's a language with ~70k speakers and if it's possible to achieve 100 active editors per month, we could say that it could somehow work in other cases, as well.
- Besides preserving languages and cultural heritage, we want to have useful information on those Wikipedias. That's a tough job for many communities because of various issues: from the lack of reasonable internet access to inherent cultural biases.
But we have some tools -- Wikidata as the most important one -- to create a lot of useful content.
But the entry barrier is very high. Editors have to know how to use computers well, as well as to think quite formally. That's a serious obstacle in areas without well-developed educational systems.
- The good news is that we have chapters in three countries with a lot of languages: India, Indonesia and Australia (though the languages in Australia are very small; on the other hand, Australia is much richer). So, we have organizational potential.
- There are, of course, a lot of other issues. Many of them, actually.
But if we don't start, we won't do anything.
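The editors-per-million arithmetic from the list above can be sketched as follows. This is just a back-of-the-envelope illustration: the function names are made up, and the default of 100 active editors per million speakers is the rough top-end figure quoted above, not a measured constant.

```python
def expected_active_editors(speakers, editors_per_million=100):
    """Active editors (5+ edits per month) expected for a language community,
    assuming the rate observed in the best-performing communities."""
    return speakers * editors_per_million / 1_000_000


def required_rate_per_million(speakers, target_editors):
    """Editors-per-million rate a community would need to reach a target."""
    return target_editors * 1_000_000 / speakers


if __name__ == "__main__":
    print(expected_active_editors(100_000))  # 10.0
    print(expected_active_editors(10_000))   # 1.0
    # Rate needed for ~70k speakers to yield 100 active editors
    print(round(required_rate_per_million(70_000, 100)))
```

Under these assumptions, reaching 100 active editors among ~70k speakers would need a rate of roughly 1,430 editors per million, well above the ~100 per million seen in the strongest communities, which is why it would be a demanding test.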
As you can see, I wrote this not as a kind of plan, but as a set of open questions. I'd like your input (first here, then on Meta): What do you think? How can we start working on it? What do you think would be the most efficient way? Ways? Any other ideas?
I'd call on you to give wings to your imagination. To be able to solve that, we need bold ideas. On the other hand, I'd appreciate input from people with more organizational skills, as well.