Awesome work! It's interesting to see Finnish as the outlier here. Do
we have any fi-users on the list who can comment on this and might
know what's going on? (And, in the absence of Finns: Jan, heard
anything from across the border? :p)
The only caution I'd raise is that these numbers don't include spider
filtering. Why is this important? Well, a lot of traffic is driven by
crawlers and spiders and automata, particularly on smaller projects,
and it can lead to weirdness as a result. With the granular pagecount
files there's some work that can be done to detect this (for example,
using burst detection and a few heuristics around concentration
measures to eliminate pages that are clearly driven by automated
traffic - see the recent analytics mailing list thread) but only some.
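For anyone curious what that kind of filtering could look like, here is a minimal sketch. It is an illustration only, not the method from the analytics thread: it assumes you have already pulled one page's hourly view counts out of the pagecount files into a list, and the function names and thresholds (`factor`, `min_views`, `max_peak_share`) are made up for the example and would need tuning against real data.

```python
from statistics import median


def peak_share(hourly_views):
    """Concentration measure: share of all views that fall in the busiest hour."""
    total = sum(hourly_views)
    return max(hourly_views) / total if total else 0.0


def has_burst(hourly_views, factor=20.0, min_views=100):
    """Burst check: is the busiest hour far above the page's median hour?

    `factor` and `min_views` are illustrative thresholds, not tuned values.
    """
    peak = max(hourly_views)
    baseline = max(median(hourly_views), 1)  # avoid a zero baseline
    return peak >= min_views and peak >= factor * baseline


def looks_automated(hourly_views, max_peak_share=0.5):
    """Flag a page whose views are both bursty and concentrated in one hour."""
    if not hourly_views:
        return False
    return has_burst(hourly_views) and peak_share(hourly_views) > max_peak_share


# Hypothetical hourly counts for two pages.
organic = [40, 50, 45, 60, 55, 48, 52, 47]
crawled = [2, 3, 1, 2, 900, 2, 3, 1]
print(looks_automated(organic), looks_automated(crawled))
```

Requiring both conditions keeps a small page with one genuinely popular hour from being thrown out on the burst check alone, but it will still miss crawlers that spread their requests evenly, which is part of why this only gets you some of the way.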
I appreciate this is a flaw in the data we are releasing, not in your
work, which is an excellent read and highly interesting :). I agree
that understanding the lack of development in the PRC and ROK is
crucial - we keep talking about the "next billion readers" but only
talking :(
On 16 March 2015 at 02:21, h <hanteng(a)gmail.com> wrote:
Dear all,
I have some findings showing that the page-views-per-Internet-user
measurement may help in comparing different language editions of Wikipedia.
Criticism and suggestions are welcome.
-----
http://people.oii.ox.ac.uk/hanteng/2015/03/15/comparing-language-developmen…
Which language version of Wikipedia most exceeds its expected page views per
language Internet user? It is Finnish. In terms of absolute positive and
negative gaps, English has the widest positive gap whereas Chinese has the
largest negative gap.
......
In particular, it is known that Wikipedia (and Google which often favours
Wikipedia) faces local competition in the People's Republic of China and
South Korea. Therefore it is understandable that page views may be lower in
the Chinese and Korean Wikipedia language projects, simply because some users'
need to read user-generated encyclopedias is satisfied by other websites.
However, it remains an important question to examine why these particular
Latin and Asian languages are under-developed for Wikipedia projects.
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Oliver Keyes
Research Analyst
Wikimedia Foundation