Very interesting; lists missing Wikipedias in Indic languages too.
---------- Forwarded message ----------
From: Milos Rancic <millosh(a)gmail.com>
Date: Sat, Jun 25, 2011 at 10:22 AM
Subject: [Foundation-l] Languages and numbers
To: Wikimedia Foundation Mailing List <foundation-l(a)lists.wikimedia.org>
While preparing Missing Wikipedias, I've compiled the numbers of speakers and
languages not covered by Wikipedias, by area and by country with a chapter.
Numbers are preliminary, some of them should be corrected. I didn't
exclude Han languages, which mostly shouldn't be counted, and similar.
Note, also, that every language should be analyzed separately. Many
languages are spoken not just inside of one country.
Please, fix errors and comment.
* * *
Areas. They approximate the usual definitions of areas, but they are
different because of linguistic corrections.
* Afro-Asiatic Area: Area where Afro-Asiatic languages are dominant.
North Africa + Middle East + Sudan, Ethiopia, Eritrea and Somalia - Iran.
* Europe: Europe (including Caucasus) includes Turkey.
* South Asia: South Asia + Iran. Dominantly Indo-European and Dravidian languages.
* Sub-Saharan Africa: The rest of Africa.
* Polynesia, Australia and Oceania: Includes Malaysia and Taiwan
(Taiwanese languages not covered in Wikipedias are dominantly Austronesian.)
* East Asia: Han China "China (Central)", Korea and Japan.
* South-East Asia: Includes non-Han south China "China (South)".
* Latin America: Parts of America where Spanish and Portuguese are
official languages.
* Anglo-French America: Parts of America where English, French and Dutch
are official languages.
* North Asia: Asian part of former USSR, Mongolia and non-Han northern
and western China "China (North)".
The first column is number of speakers, the second number of languages,
the third is area.
399259294 592 South Asia
353676706 1805 Sub-Saharan Africa
221855457 253 Afro-Asiatic Area
138979263 2198 Polynesia, Australia and Oceania
107363760 37 East Asia
99260271 447 South-East Asia
47901185 143 Europe
30361602 724 Latin America
8481452 227 Anglo-French America
3724384 45 North Asia
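For anyone who wants to check or re-total the area table above, here is a minimal Python sketch. It assumes the three-column layout described above (speakers, languages, area) and simply sums the columns; the names `TABLE` and `parse_rows` are made up for illustration. Since the numbers are preliminary, the totals are only as good as the input.

```python
# Area table copied from the message above: speakers, languages, area.
TABLE = """\
399259294 592 South Asia
353676706 1805 Sub-Saharan Africa
221855457 253 Afro-Asiatic Area
138979263 2198 Polynesia, Australia and Oceania
107363760 37 East Asia
99260271 447 South-East Asia
47901185 143 Europe
30361602 724 Latin America
8481452 227 Anglo-French America
3724384 45 North Asia
"""


def parse_rows(text):
    """Split each line into (speakers, languages, area)."""
    rows = []
    for line in text.strip().splitlines():
        # Split on the first two runs of whitespace only,
        # because area names themselves contain spaces.
        speakers, languages, area = line.split(None, 2)
        rows.append((int(speakers), int(languages), area))
    return rows


rows = parse_rows(TABLE)
total_speakers = sum(row[0] for row in rows)
total_languages = sum(row[1] for row in rows)

if __name__ == "__main__":
    print(f"{total_speakers} speakers, {total_languages} languages in total")
```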
* * *
Countries with chapters. (Numbers are not fully correct, as they include
some languages removed in the list below this one.)
If any chapter (or interested group) is interested in the full list of
missing languages, I'll provide it on request before completing the
work. I suppose that some chapters are interested in languages with fewer
than 100K speakers, as well.
296,097,274 349 India
71,356,176 681 Indonesia
46,676,395 157 Philippines
7,819,010 9 Germany
7,994,871 76 Russian Federation
5,386,580 5 Serbia
4,785,299 6 South Africa
2,841,300 17 Israel
1,139,750 4 Ukraine
1,085,931 125 United States
832,000 3 Netherlands
705,967 70 Canada
472,470 1 Czech Republic
375,704 17 Taiwan
313,642 6 Chile
246,900 3 United Kingdom
200,500 4 Spain
191,430 5 Poland
151,240 7 Sweden
132,809 12 Argentina
86,390 155 Australia
50,000 1 France
30,000 1 Hungary
29,980 4 Switzerland
17,460 5 Finland
15,000 1 Portugal
10,500 2 Norway
5,000 1 Denmark
4,500 1 Estonia
Languages with more than a million (or more than 100,000) speakers,
without a Wikipedia and with a chapter in the country:
India (more than a million)
3633900 Konkani, Goan
2680000 Indian Sign Language
1950000 Gondi, Northern
1045000 Panjabi, Mirpur
1000000 Pahari, Mahasu
Indonesia (more than a million)
2350000 Malay, Central
2000000 Batak Toba
1880000 Malay, Makassar
1200000 Batak Simalungun
1200000 Batak Dairi
1100000 Batak Mandailing
1000000 Malay, Jambi
Philippines (more than 100k)
2500000 Bicolano, Central
1900000 Bicolano, Albay
540000 Bontoc, Central
319000 Sama, Southern
234000 Bicolano, Iriga
185000 Sorsogon, Waray
150000 Blaan, Koronadal
140000 Subanen, Central
122000 Bicolano, Northern Catanduanes
100000 Philippine Sign Language
2000000 Saxon, Upper
460090 Mari, Meadow
Serbia and Kosovo
4156090 Albanian, Gheg
709570 Romani, Balkan
318920 Romani, Sinte
4101000 Sotho, Northern
1762320 Yiddish, Eastern
352500 Arabic, Judeo-Tunisian
258930 Arabic, Judeo-Moroccan
100130 Arabic, Judeo-Iraqi
600000 Hawai’i Creole English
250000 Sea Island Creole English
472470 Romani, Carpathian
102000 Spanish Sign Language
109600 Finnish, Tornedalen
We would like to share the cool stuff you are doing for Wikimedia.
Most of the time people want to help Wikimedia projects but don't know what
they can do.
So if you are working on something and want others to collaborate on that
article or work, please share it.
Facebook page of WikimediaIndia : http://www.facebook.com/WikimediaIndia --
*anything related to Indic wikipedias or English WikiProject India*
Facebook page of English WikiProject India :
related to English WikiProject India*
You can email me or any admin of these projects.
To share the work on Wikipedia page
Please contact Kevin Gorman (communicationsintern(a)wikimedia.org)
In a recent mail to the google group regarding Google's Indic Wikipedia
translation project, Dimple Batra from Google reported:
"We are in the process of closing down the wikipedia indic language
We had initiated this project to bootstrap the creation of indic content and
encourage consumption in indic languages. Having accomplished the same over
the past 2 years we would now like the community to continue contributing.
Please feel free to reach out to us if you have other proposals on how we
can create more local content in India"
Please let us know how the different Indic Wikipedia communities involved in
this project are going to handle this.
As far as Tamil Wikipedia is concerned:
1000+ articles created using this project went through a few rounds of
corrections by both Wikipedians and the paid translators. About 50% of them
were found to be of acceptable quality. We selected a few translators who were
doing a better job, and a test case of 25 articles was assigned, based on
which further articles were to be written. But the collaboration between
Tamil Wikipedia and Google lost its momentum after this stage.
Frankly, Tamil Wikipedians were exhausted by this year-long project, with
more than 20 Wikipedians contributing to it. The goodwill between Google,
the translators and Wikipedians did not improve, as the translators showed no
interest in improving their previous articles any further until new articles
were assigned to them (which means continuous employment for them and profit
for the companies involved). Later, we came to know that most of the
translators involved in the project had resigned from their respective
companies, and we expected a standstill in this project.
In this situation, the above reply from Google came for a query posted by
Arjuna Rao Chavala regarding the project's status in Telugu Wikipedia.
We haven't decided yet on the next course of action on the Google created
articles. My expectation is that they will be treated as regular articles
and the community will try to work on them. In the case of very badly written
articles, part of the article may be moved to the talk page and moved back
after it is corrected. However, this will be a really time-consuming
exercise.
My personal opinion is that I am very much disappointed. Google's intention
was good, but the execution left much to be desired. It didn't show interest in
talking to the community before starting, or before ending the project abruptly
on its own.
We are in the middle of an ongoing discussion about changing the current description.
The present description is as below:
ଓଡ଼ିଆ ଉଇକିପିଡ଼ିଆ (Odia Wikipedia)
ଗୋଟିଏ ଖୋଲା ଭାଷାକୋଷ (Eka khola bhasakosa meaning An Open Encyclopedia)
But as most are suggesting the name "Gyanamandala" or "Gyana kosa", I just want
to ask one thing: is it fine if we change the description from "Bhasakosa"
to one of these?
But here is a question! A friend has mentioned that "Gyanamandala" is the
name of an old encyclopedia *(published during 1961-1988 by Binod Kanungo)*.
But "Gyanamandala" is a generic name in the Odia language which means
"Encyclopedia". So, can we use this word in the description?
PS: This is the text which appears below the name "Wikipedia" in the Odia
Wikipedia logo (http://or.wikipedia.org).
Please give your suggestions.
I was wondering if anyone here could offer some help to the Chapter to
set up a system to process membership forms etc.
Essentially, what we need is:
1. A front end for the applicant to enter form details.
2. A way to update these details with payment information, etc. as and
2. A database to store that.
3. A way to generate a:
3.2 Membership Number
3.3 Membership Card
3.4 Send an email when it is posted
3.5 A PDF of the form so entered for the applicant to sign and send.
4. Add the member to a mailing list
5. Send reminder email as and when necessary
6. It would be nice to have:
6.1 A way for members to update and edit their details as and when it changes
7. Open source software will be preferred.
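As a starting point for discussion, the database, payment update and membership number items above could be sketched roughly like this. It assumes Python with its bundled sqlite3 module (both open source). The table layout, the function names and the `M00001` number format are all made up for illustration and are not an agreed Chapter design; the front end, PDF, card, email and mailing list parts are left out.

```python
import sqlite3

# Hypothetical schema: one row per applicant/member.
SCHEMA = """
CREATE TABLE IF NOT EXISTS members (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    name TEXT NOT NULL,
    email TEXT NOT NULL UNIQUE,
    payment_ref TEXT,
    membership_number TEXT UNIQUE
)
"""


def open_db(path=":memory:"):
    """Open (or create) the membership database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def add_applicant(conn, name, email):
    """Store the details entered on the form; returns the new row id."""
    cur = conn.execute(
        "INSERT INTO members (name, email) VALUES (?, ?)", (name, email)
    )
    conn.commit()
    return cur.lastrowid


def record_payment(conn, member_id, payment_ref):
    """Attach payment information and issue a membership number."""
    # Assumed format: 'M' followed by the zero-padded row id.
    number = f"M{member_id:05d}"
    cur = conn.execute(
        "UPDATE members SET payment_ref = ?, membership_number = ? WHERE id = ?",
        (payment_ref, number, member_id),
    )
    if cur.rowcount == 0:
        raise LookupError(f"no applicant with id {member_id}")
    conn.commit()
    return number


if __name__ == "__main__":
    conn = open_db()
    member_id = add_applicant(conn, "Example Applicant", "applicant@example.org")
    print(record_payment(conn, member_id, "RECEIPT-1"))
```

Deriving the number from the row id keeps it unique without extra bookkeeping; a real system might prefer a format that encodes the year or membership type.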
Wikimedia India Programs
skype : hisham.wikimedia
gtalk : hmundol(a)wikimedia.org
twitter : @mundol
On Jun 23, 2011, at 4:05 PM, Federico Leva (Nemo) wrote:
> Hisham Mundol, 23/06/2011 11:54:
>> *There are 19,000 laptops* that will be distributed in this initiative.
>> (These are on Ubuntu Linux - and very reasonably configured.)
> How much disk space do they have? It shouldn't be too difficult to add a Kiwix installation with English language (and indic languages) Wikipedias ZIMs, if there's space enough.
320GB so space isn't a problem.
>> Assam has traditionally had a problem with infrastructure and Internet
>> access is a problem. Someone who is supporting Amtron has asked the
>> Foundation if we can give them an offline version of Wikipedia to
>> pre-load onto these computers. Given that it is for 15-16 age group, it
>> does need to be of appropriate content.
> Uh? What sort of Wikipedia content could ever be inappropriate for 15-16 year old boys and girls? I can't imagine any. :-/
:-) Would suggest we err on the side of caution. Majorly on the side of caution.
For those who don't know Shyamal, an active Wikipedian from India, read on.
(Forwarding this mail with the permission of AshLin.)
---------- Forwarded message ----------
From: Ashwin Baindur
Date: Wed, Jun 22, 2011 at 5:15 PM
Subject: [WikiConference-India 2011] First speaker lined up!
To: For organising national WikiConferences in India <
Hot on the heels of our first tentative sponsor is our tentative first
speaker. On the talk page of the schedule on Meta, I had suggested that we
could schedule a talk on ''Indian Biodiversity in Wikipedia - Issues,
challenges & the way ahead'' by Shyamal.
He has agreed to my proposal.
The Indian biodiversity scene has a number of people who work on Wikipedia
but rarely venture out in Outreach and as such remain completely unknown to
the general Wikipedia community. Shyamal is such a guy. So I will introduce
you to him.
An editor since 26 November 2002, Shyamal has an edit count of over 36,000
edits on English Wikipedia, is an administrator on English Wikipedia and is
quite simply India's most prolific contributor on nature and wildlife.
Respected amongst the ornithological community in India, where he wrote a
seminal paper in 2007 on "Whither Indian Ornithology" in a peer-reviewed
journal, Indian Birds,
...he has contributed over 5200 images to Commons, many of them carefully
retrieved from nineteenth century books and cleaned up and loaded as well as
svg images painstakingly crafted
He has theorised about the role of Open Science in India, and commented on
influential papers on Wikipedia in Science.
Shyamal has also loaded hundreds of documents into www.archive.org where
they are now available to posterity.
When he reads this, Shyamal will once again accuse me of being a hagiographer.
Well, as a fauji, I like to call a spade a spade. Just thought, you'd like
The Indian community has many unsung WikiHeroes, one of whom is Shyamal.