On Thu, Aug 20, 2009 at 9:22 PM, Lars Aronsson <lars@aronsson.se> wrote:
Kaare Olsen wrote:
What I think is the primary reason for the Danish Wikipedia being much smaller than the "neighbouring" languages is that Danes generally are internationally minded and pride themselves on being good at English - people may simply prefer to use/edit Wikipedia in that language (even I did that when first attracted to Wikipedia).
I find it hard to believe that this would be a major difference between Denmark and Sweden. But it would be really interesting if we could somehow trace the use of the English Wikipedia to users of various mother tongues (for Northern Europe, country or IP address range might be a good enough approximation for mother tongue). Perhaps Swedish users stay on the Swedish Wikipedia to read about sports, but go to the English to read about music.
For each IP address range, we could (well, Domas could) analyze which language of Wikipedia those users primarily go to. If users from 130.236.xxx.yyy mostly visit the English and Swedish Wikipedia, we can assume that it constitutes a Swedish-speaking community. If no conclusive pattern is shown on the /16 (class B) range, each /24 (class C) net can be analyzed individually.
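A rough sketch of that /16-then-/24 fallback, in Python. Everything here is an assumption made for illustration: the input is taken to be (IP address, wiki language code) pairs from some request log, the names `prefix`, `dominant_language` and `classify` are hypothetical, and the 0.6 cutoff for a "conclusive" pattern is arbitrary. English hits are ignored when picking the local language, since English is expected to show up everywhere.

```python
from collections import Counter, defaultdict
import ipaddress


def prefix(ip, bits):
    """Return the network of the given prefix length containing ip."""
    return ipaddress.ip_network(f"{ip}/{bits}", strict=False)


def dominant_language(counts, threshold=0.6):
    """Return the leading non-English language if it accounts for at least
    `threshold` of the non-English hits, otherwise None (inconclusive)."""
    non_english = {lang: n for lang, n in counts.items() if lang != "en"}
    total = sum(non_english.values())
    if total == 0:
        return None
    lang, n = max(non_english.items(), key=lambda kv: kv[1])
    return lang if n / total >= threshold else None


def classify(hits, threshold=0.6):
    """hits: iterable of (ip_string, wiki_language) pairs (assumed format).

    Returns {network: language}. A /16 is reported when its pattern is
    conclusive; otherwise each /24 inside it is examined individually.
    """
    by16 = defaultdict(Counter)
    by24 = defaultdict(Counter)
    for ip, lang in hits:
        by16[prefix(ip, 16)][lang] += 1
        by24[prefix(ip, 24)][lang] += 1

    result = {}
    for net16, counts in by16.items():
        lang = dominant_language(counts, threshold)
        if lang is not None:
            result[net16] = lang
            continue
        # No conclusive pattern on the /16: look at each /24 within it.
        for net24, counts24 in by24.items():
            if net24.subnet_of(net16):
                lang24 = dominant_language(counts24, threshold)
                if lang24 is not None:
                    result[net24] = lang24
    return result
```

Calling `classify` on a list of such pairs gives one entry per range that showed a clear pattern; ranges that stay inconclusive even at /24 are simply left out. A real analysis would also need some minimum hit count per range before trusting the result.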
I published a very simple GEO vs Project readership report a couple of years back. I could dig up the data, but it's old now. It's not terribly hard to run, and the old script should still work.
It was generally the case that for much of the world the English Wikipedia was accessed by readers with roughly comparable frequency to the 'expected' language Wikipedia, and in some cases far more so… though there were some significant exceptions: for example, the Italians stuck to itwiki and the Japanese stuck to jawiki. Much of Europe was more mixed.
There is also this old data: http://meta.wikimedia.org/wiki/Edits_by_project_and_country_of_origin
How many messages need to be translated to make MediaWiki basically usable? My own belief was that you only needed a few dozen to make the software basically usable, at least enough to bootstrap usage.