Hello,

Awesome work! It's interesting to see Finnish as the outlier here. Do
we have any fi-users on the list who can comment on this and might
know what's going on? (And, in the absence of Finns: Jan, heard
anything from across the border? :p)

The only caution I'd raise is that these numbers don't include spider
filtering. Why is this important? Well, a lot of traffic is driven by
crawlers, spiders and other automata, particularly on smaller projects,
and that can distort per-project numbers. With the granular pagecount
files there's some work that can be done to detect this (for example,
using burst detection and a few heuristics around concentration
measures to eliminate pages that are clearly driven by automated
traffic; see the recent analytics mailing list thread), but that only
catches some of it.
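
To make the burst/concentration idea a bit more concrete, here is a
minimal sketch in Python. It is only an illustration of the general
approach, not the method from the analytics thread: the function names,
the input layout (a dict of page title to a list of hourly view counts)
and every threshold are assumptions I've made up for the example.

```python
from statistics import median


def looks_automated(hourly_counts, burst_factor=20.0, max_share=0.5, min_total=100):
    """Flag a page whose traffic is both bursty and concentrated in one hour.

    All thresholds are illustrative assumptions, not tuned values.
    """
    total = sum(hourly_counts)
    if total < min_total:
        return False  # too little traffic to judge either way
    peak = max(hourly_counts)
    baseline = median(hourly_counts)
    # Burst: the busiest hour dwarfs the typical hour
    # (the +1 keeps a zero baseline from flagging everything).
    bursty = peak > burst_factor * (baseline + 1)
    # Concentration: a single hour accounts for most of the page's views.
    concentrated = peak / total > max_share
    return bursty and concentrated


def filter_pages(pagecounts, **thresholds):
    """Return only the pages that do not look automated.

    pagecounts is a hypothetical mapping of page title -> hourly counts.
    """
    return {
        title: counts
        for title, counts in pagecounts.items()
        if not looks_automated(counts, **thresholds)
    }
```

Summing the views of whatever filter_pages() keeps would give a rough
"filtered" total per project. Crawlers that spread their requests
evenly over the day would sail straight through a check like this,
which is why this sort of filtering only gets you part of the way.
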

I appreciate this is a flaw in the data we are releasing, not in your
work, which is an excellent read and highly interesting :). I agree
that understanding the lack of development in the PRC and ROK is
crucial: we keep talking about the "next billion readers", but talking
is all we do :(

On 16 March 2015 at 02:21, h <hanteng@gmail.com> wrote:
> Dear all,
>
> I have some findings suggesting that a page-views-per-Internet-user
> measurement may help in comparing different language editions of Wikipedia.
> Criticism and suggestions are welcome.
>
>
> -----
> http://people.oii.ox.ac.uk/hanteng/2015/03/15/comparing-language-development-in-wikipedia-in-terms-of-page-views-per-internet-users/
>
> Which language version of Wikipedia enjoys the most page views per Internet
> user of that language, relative to what would be expected? It is Finnish.
> In terms of absolute gaps, English has the widest positive gap, whereas
> Chinese has the largest negative gap.
>
> ......
>
> In particular, it is known that Wikipedia (and Google, which often favours
> Wikipedia) faces local competition in the People's Republic of China and
> South Korea. It is therefore understandable that page views may be lower for
> the Chinese and Korean Wikipedia language projects, simply because some
> users' need to read user-generated encyclopedias is satisfied by other
> websites. However, it remains an important question why these particular
> Latin and Asian languages are under-developed as Wikipedia projects.
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>
--
Oliver Keyes
Research Analyst
Wikimedia Foundation