[Foundation-l] 1.3 billion of humans don't have Wikipedia in their native language

Milos Rancic millosh at gmail.com
Sun May 22 11:15:34 UTC 2011

I am preparing a document for Wikimania. At present I am analyzing
data (SIL [1], Ethnologue [2], Wikimedia projects). I am using
Ethnologue data for population estimates.

Before I started this task, I thought that the situation was not so bad
(or not so good, if you see it as an opportunity for development). I
thought that we had almost run out of languages with more than 1M
speakers that lack a Wikipedia. However, this is far from being true.

There are no Wikipedias in 243 languages with more than 1M speakers.
Of those, 27 have more than 10M speakers.

The biggest language without any Wikimedia project is Jin Chinese, with
45 million speakers.

Around 1 billion people speak big languages (more than 1M speakers)
that have no Wikipedia (or any Wikimedia project) in their language.

Of those, 480 million have test projects, but 550 million don't have
even a test project. The biggest languages without a Wikipedia include
(those with a test project are marked "incubator"):

* Jin Chinese, 45M, China
* Haryanvi, 38M, India, incubator
* Xiang Chinese, 36M, China, incubator
* Maithili, 34M, India, incubator
* Nigerian Pidgin, 30M, Nigeria, incubator
* Filipino, 25M, Philippines, incubator
* Chhattisgarhi, 17.5M, India, incubator
* Rangpuri, 15M, Bangladesh
* Seraiki, 13.8M, Pakistan, incubator
* Madura, 13.6M, Indonesia, incubator
* Haryanvi, 13M, India
* Deccan, 12.8M, India
* Malvi, 10.4M, India
* Min Bei Chinese, 10.3M, China, incubator
* Sylheti, 10.3M, Bangladesh

Around 300 million people speak languages with fewer than 1M speakers
that don't have Wikipedia editions.
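
As a rough check on how the rounded figures above combine into the 1.3
billion in the subject line, here is a minimal Python sketch. The
variable names are illustrative only; the numbers are the estimates
quoted in this message, not new data.

```python
# Rounded estimates quoted in this message, in millions of speakers.
with_test_project = 480     # languages with >1M speakers, test project only
without_test_project = 550  # languages with >1M speakers, no test project
small_languages = 300       # languages with <1M speakers, no Wikipedia

# Speakers of big languages without a Wikipedia ("around 1 billion").
big_languages = with_test_project + without_test_project

# Everyone without a Wikipedia in their native language ("1.3 billion").
total = big_languages + small_languages

print(f"big languages without Wikipedia: {big_languages}M")
print(f"all languages without Wikipedia: {total}M")
```

The sums come to 1030M and 1330M, which round to the "around 1 billion"
and "1.3 billion" used in this message.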

Note that for all languages in the world Ethnologue gives a total of
6.15 billion speakers, which is pretty accurate, considering that the
current population estimate (according to Wikipedia [3]) is 6.92 billion
and that counting speakers is very different from compiling official
population statistics.
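
To put the two totals side by side, a small Python sketch (the names
are illustrative; both inputs are the figures quoted above) computes
what share of the population estimate the Ethnologue speaker count
covers:

```python
# Both figures are in billions, as quoted in this message.
ethnologue_speakers = 6.15  # sum of speakers over all languages
world_population = 6.92     # population estimate cited from Wikipedia [3]

# Share of the population estimate accounted for by the speaker counts.
coverage = ethnologue_speakers / world_population

# People not accounted for by the speaker counts, in billions.
gap = world_population - ethnologue_speakers

print(f"coverage: {coverage:.1%}")
print(f"gap: {gap:.2f} billion")
```

Under these figures the speaker counts cover a little under 89% of the
population estimate, leaving a gap of about 0.77 billion.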

These are preliminary results. We have two chapters (and a strategic
focus) in countries from the list above. The longer list, which still
needs to be verified, covers more countries with chapters. I noted that
there are even two languages of Germany without a Wikipedia but with
more than a million speakers: Mainfränkisch and Upper Saxon (the latter
without a test project).

The countries that have languages with more than 1M speakers and
without a Wikipedia are: Afghanistan, Algeria, Angola, Bangladesh, Benin,
Bolivia, Brazil, Burkina Faso, Cameroon, Chad, China, Congo, Côte
d’Ivoire, Democratic Republic of the Congo, Ecuador, Egypt, Equatorial
Guinea, Eritrea, Ethiopia, Germany, Ghana, Guatemala, Guinea, India,
Indonesia (Java and Bali), Indonesia (Kalimantan), Indonesia (Nusa
Tenggara), Indonesia (Sulawesi), Indonesia (Sumatra), Iran, Iraq,
Jamaica, Jordan, Kenya, Libya, Madagascar, Malawi, Malaysia
(Peninsular), Mali, Mauritania, Morocco, Mozambique, Myanmar, Namibia,
Niger, Nigeria, Pakistan, Paraguay, Peru, Philippines, Saudi Arabia,
Senegal, Serbia, Sierra Leone, Somalia, South Africa, Sudan, Syria,
Tanzania, Thailand, Tunisia, Turkey (Asia), Uganda, Viet Nam, Yemen,
Zambia, Zimbabwe.

[1] http://www.sil.org/
[2] http://www.ethnologue.com/
[3] http://en.wikipedia.org/wiki/Human_population
