[Foundation-l] 1.3 billion of humans don't have Wikipedia in their native language
george.herbert at gmail.com
Sun May 22 11:28:04 UTC 2011
On Sun, May 22, 2011 at 4:15 AM, Milos Rancic <millosh at gmail.com> wrote:
> I am preparing a document for Wikimania. At present, I am in the process of
> analyzing data (SIL, Ethnologue, Wikimedia projects). I am using
> Ethnologue data for population estimates.
> Before I started this task, I thought that the situation was not so bad
> (or good, if we see it as a possibility for development). I thought that we
> were near the end of the languages with more than 1M speakers. However,
> this is far from being true.
> There are no Wikipedias in 243 languages with more than 1M speakers.
> Of those, 27 have more than 10M speakers.
> The biggest language without any Wikimedia project is Jin Chinese, with
> 45 million speakers.
> Around 1 billion people belong to the group of big languages without
> Wikipedia (or any Wikimedia project) in their language.
> Of those, 480 million have test projects, but 550 million don't have
> even a test project; including:
> * Jin Chinese, 45M, China
> * Haryanvi, 38M, India, incubator
> * Xiang Chinese, 36M, China, incubator
> * Maithili, 34M, India, incubator
> * Nigerian Pidgin, 30M, Nigeria, incubator
> * Filipino, 25M, Philippines, incubator
> * Chhattisgarhi, 17.5M, India, incubator
> * Rangpuri, 15M, Bangladesh
> * Seraiki, 13.8M, Pakistan, incubator
> * Madura, 13.6M, Indonesia, incubator
> * Haryanvi, 13M, India
> * Deccan, 12.8M, India
> * Malvi, 10.4M, India
> * Min Bei Chinese, 10.3M, China, incubator
> * Sylheti, 10.3M, Bangladesh
> Around 300 million people use languages with fewer than 1M
> speakers which don't have Wikipedia editions.
> Note that for all languages in the world Ethnologue gives a total of
> 6.15 billion speakers, which is pretty accurate, considering that the
> current estimate of world population (according to Wikipedia) is 6.92
> billion and that counting speakers is very different from counting
> official population statistics.
> Those are preliminary results. We have two chapters (and a strategic
> focus) in countries on the list above. In the longer list, which
> should be verified, we have more chapters. I noted that there are even
> two languages of Germany without Wikipedia, but with more than a million
> speakers: Mainfränkisch and Upper Saxon (the latter one without test
> The list of countries with languages with more than 1M speakers and
> without Wikipedia is: Afghanistan, Algeria, Angola, Bangladesh, Benin,
> Bolivia, Brazil, Burkina Faso, Cameroon, Chad, China, Congo, Côte
> d’Ivoire, Democratic Republic of the Congo, Ecuador, Egypt, Equatorial
> Guinea, Eritrea, Ethiopia, Germany, Ghana, Guatemala, Guinea, India,
> Indonesia (Java and Bali), Indonesia (Kalimantan), Indonesia (Nusa
> Tenggara), Indonesia (Sulawesi), Indonesia (Sumatra), Iran, Iraq,
> Jamaica, Jordan, Kenya, Libya, Madagascar, Malawi, Malaysia
> (Peninsular), Mali, Mauritania, Morocco, Mozambique, Myanmar, Namibia,
> Niger, Nigeria, Pakistan, Paraguay, Peru, Philippines, Saudi Arabia,
> Senegal, Serbia, Sierra Leone, Somalia, South Africa, Sudan, Syria,
> Tanzania, Thailand, Tunisia, Turkey (Asia), Uganda, Viet Nam, Yemen,
> Zambia, Zimbabwe.
Good work generally, but regarding this last list...
Afghanistan has many languages in use (Pashto, Tajik, Hazara, Uzbek);
Algeria uses Arabic, Berber, and French; Jordan's official language
is Arabic (though the spoken one is a dialect); and generally so
Can you break this out by which languages we are missing, not just by
country, as country isn't specific enough?
-george william herbert
george.herbert at gmail.com