Hello,
With all admiration for the maths, I think that we can learn less from these figures than we might hope. In these statistics I often see a strangely high proportion of traffic from the US or other countries that is difficult to explain. Why, for example, should there be so many people in the US who are interested in Frisian Wikipedia?
Even if the numbers and proportions are right: there are too many factors to consider.
Some years ago I did some research on individual Wikipedia language versions, and combining several methods still seems to be the most useful approach. Interviews with the local Wikipedians are very important. It would be great to have more interviews with readers or potential readers (or potential non-readers) in order to find out why a Wikipedia language version grows, or does not.
Kind regards Ziko
On Monday, 16 March 2015, Oliver Keyes wrote:
Awesome work! It's interesting to see Finnish as the outlier here. Do we have any fi-users on the list who can comment on this and might know what's going on? (And, in the absence of Finns: Jan, heard anything from across the border? :p)
The only caution I'd raise is that these numbers don't include spider filtering. Why is this important? Well, a lot of traffic is driven by crawlers and spiders and automata, particularly on smaller projects, and it can lead to weirdness as a result. With the granular pagecount files there's some work that can be done to detect this (for example, using burst detection and a few heuristics around concentration measures to eliminate pages that are clearly driven by automated traffic - see the recent analytics mailing list thread) but only some. I appreciate this is a flaw in the data we are releasing, not in your work, which is an excellent read and highly interesting :). I agree that understanding the lack of development in the PRC and ROK is crucial - we keep talking about the "next billion readers" but only talking :(
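The burst-detection and concentration heuristics mentioned above could look roughly like the following. This is only a minimal sketch, assuming hourly view counts per page are available; the function name `looks_automated` and all thresholds are hypothetical and are not taken from the analytics mailing list thread.

```python
from statistics import median


def looks_automated(hourly_views, burst_factor=10.0, max_share=0.5, min_total=100):
    """Flag a page whose traffic is concentrated in one short burst.

    hourly_views: list of view counts, one per hour (assumed input shape).
    All thresholds are illustrative assumptions, not tuned values.
    """
    total = sum(hourly_views)
    if total < min_total:
        return False  # too little traffic to judge either way

    peak = max(hourly_views)
    typical = median(hourly_views)

    # Burst: the peak hour is far above a typical hour
    # (max(..., 1) guards against a zero median).
    is_burst = peak > burst_factor * max(typical, 1)

    # Concentration: a single hour holds a large share of all views.
    is_concentrated = peak / total > max_share

    return is_burst and is_concentrated
```

A page with steady traffic, such as `[10] * 24`, is not flagged, while one with `[2] * 23 + [900]` is. Real spider filtering would need more than this, as noted above.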
On 16 March 2015 at 02:21, h <hanteng@gmail.com> wrote:
Dear all,
I have some findings showing that a page-views-per-Internet-user measurement may help in comparing different language editions of Wikipedia. Criticism and suggestions are welcome.
http://people.oii.ox.ac.uk/hanteng/2015/03/15/comparing-language-development...
Which language version of Wikipedia enjoys the most page views per language Internet user, relative to what would be expected? It is Finnish. In terms of absolute gaps, English has the widest positive gap, whereas Chinese has the largest negative gap.
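One way to read the gap described above can be sketched as follows. This is an illustration under a stated assumption: "expected" page views are taken to be proportional to each language's share of Internet users, which may differ from the baseline used in the linked post. The function name and the input dictionaries are hypothetical.

```python
def views_per_user_gap(page_views, internet_users):
    """Compare actual page views with a proportional expectation.

    page_views: {language: page views}
    internet_users: {language: Internet users of that language}
    Assumption: expected views are proportional to the share of users.
    """
    total_views = sum(page_views.values())
    total_users = sum(internet_users[lang] for lang in page_views)

    result = {}
    for lang, views in page_views.items():
        users = internet_users[lang]
        expected = total_views * users / total_users
        result[lang] = {
            "views_per_user": views / users,
            "expected_views": expected,
            "gap": views - expected,  # positive: more views than expected
        }
    return result
```

With made-up inputs such as `{"aa": 300, "bb": 100}` page views and 100 users each, language `aa` gets a gap of +100 and `bb` a gap of -100. These numbers are placeholders, not real figures.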
......
In particular, it is known that Wikipedia (and Google, which often favours Wikipedia) faces local competition in the People's Republic of China and South Korea. Therefore it is understandable that page views may be lower in the Chinese and Korean Wikipedia language projects, simply because some users' need to read user-generated encyclopedias is satisfied by other websites.
However, it remains an important question to examine why these particular Latin and Asian languages are under-developed for Wikipedia projects.
Wiki-research-l mailing list Wiki-research-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
-- Oliver Keyes Research Analyst Wikimedia Foundation