[Foundation-l] Where do our readers come from? Q&A

Mark Williamson node.ue at gmail.com
Sat Jan 16 09:40:06 UTC 2010


Sociolinguistic situations around the world are very complex, I think.
Especially in former European colonies, of which Kenya is but one example,
the language of the former colonial power often has a unique position in
society.

It is not surprising to me that the English Wikipedia is so popular compared
to any other in Kenya, but it is quite a bit more surprising that Korean,
Romanian, Bulgarian, Lithuanian, Iranian, etc. users prefer the English
Wikipedia.

Mark

On Sat, Jan 16, 2010 at 2:25 AM, Ziko van Dijk <zvandijk at googlemail.com> wrote:

> Dear Erik,
>
> Maybe there is a dirty Polish word looked up by many Polish pupils,
> and when they Google it they come to eu.WP because a Basque word
> accidentally is alike? :-)
>
> I am looking now for the interest in the native / the English
> Wikipedia in specific countries. It might be important how localized
> the software in general is. If you live in, say, Kenya, and your
> computer has Windows in English, the Internet Explorer and everything
> is oriented to English, and you google your home town in an English
> language Google, it is probable that you will get the Wikipedia
> article in English and not in Swahili.
>
> Kind regards
> Ziko
>
>
> 2010/1/16 Mark Williamson <node.ue at gmail.com>:
> > I notice in that list both Belarusian Wikipedias are listed just as
> > "Belarusian Wikipedia". It would be very informative to know which is
> > which and to have visitor statistics on both :-)
> >
> > skype: node.ue
> >
> >
> > On Fri, Jan 15, 2010 at 3:39 PM, Erik Zachte <erikzachte at infodisiac.com> wrote:
> >
> >> Here is a Q&A on all issues raised:
> >> Q=question/R=Remark, A=answer
> >>
> >> I put the more general questions on top.
> >>
> >> Cheers, Erik Zachte
> >>
> >> ------------------------------------------
> >>
> >> Q: Nikola Smolenski
> >> Is this the first time these reports have been published?
> >>
> >> A:
> >> Yes; expect the trend report to grow by accretion over time.
> >> Other reports will be built from data for the most recent 6 months only.
> >>
> >> ------------------------------------------
> >>
> >> R: Andrew Gray
> >> Andrew explains why distribution of page requests over countries favors
> >> Spanish and Portuguese speaking countries:
> >> 'Some Wikipedias - the ones which insist on only-free-images - do not
> >> use local uploads at all.'
> >>
> >> A:
> >> Thanks for explaining this unexpected distribution of page views on
> >> Commons,
> >> I had no idea.
> >>
> >> Spain        30.0%
> >> USA          29.2%
> >> Brazil        8.5%
> >> Argentina     4.8%
> >> Mexico        3.9%
> >> Germany       3.3%
> >> France        2.1%
> >> Venezuela     1.9%
> >> Chile         1.4%
> >> Costa Rica    1.4%
> >> Italy         1.4%
> >> Uruguay       1.2%
> >> Colombia      1.2%
> >> Portugal      1.1%
> >>
> >> ------------------------------------------
> >>
> >> R: Mark Williamson
> >>
> >> Two main factors influencing choice of Wikipedia language:
> >> # Fluency of the Internet-using population of a country in English.
> >> # Quality of the native Wikipedia.
> >>
> >> A:
> >> Like you say. Many Scandinavians (and Dutch people I might add) probably
> >> switch between English and local content all the time.
> >> Personally I tend to look at English Wp first in many instances, because
> >> of its obviously richer content and greater depth.
> >>
> >> ------------------------------------------
> >>
> >> Q: Ziko van Dijk
> >> Why are 40 % of the visitors of ksh.WP (the dialect of Cologne) from
> >> Japan?
> >> Why are 25 % of the visitors of eu.WP (Basque) from Poland?
> >>
> >> Q: Andre Engels
> >> I think bots are a likely explanation in the eu case
> >> (unless Erik is using an algorithm that filters out bots)
> >>
> >> A:
> >> KSH used to be code for Kashmir. Still not Japan, but much closer than
> >> Cologne.
> >> Maybe Japanese mountaineers caused this spike? (only half kidding)
> >>
> >> As for eu.wp: would Poles presume there is also a European Wikipedia?
> >> Just a guess.
> >>
> >> I do filter bots.
> >>
> >> ------------------------------------------
> >>
> >> R: Teun Spaans
> >> For trends, I would expect a bar indicating upward or downward trend,
> >> not a percentage bar.
> >>
> >> A:
> >> We can have both, a notion of importance and of change: I might color
> >> code cells as I do already in e.g. [1]
> >> This way large fluctuations really stand out. Let's first collect more
> >> history.
> >>
> >> [1] http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm
> >>
> >>
> >> ------------------------------------------
> >>
> >> Q: Nikola Smolenski
> >> Could we get this for other projects?
> >>
> >> A:
> >> This question is of course not unexpected.
> >> One consideration is we need a certain sample size to make numbers
> >> significant.
> >> For other projects, with far less traffic, few country/language pairs
> >> would be backed by sufficient data.
> >> See also below on extending the current reports with more table rows.
> >>
> >> ------------------------------------------
> >>
> >> Q: Nikola Smolenski:
> >> Could you include at Wikipedia Page Views Per Country - Overview [1] the
> >> number of Internet users from [2], and the number of views per Internet
> >> user?
> >>
> >> [1] http://tinyurl.com/yk43aq6
> >> [2] http://tinyurl.com/yfv5bwn
> >>
> >> A:
> >> Done
> >>
> >> ------------------------------------------
> >>
> >> R: Nikola Smolenski
> >> It is obvious why Slovene Wikipedia is highly visited in Sierra Leone,
> >> and Serbian in Suriname; URLs do matter :)
> >> Although, I don't understand why so much. I would expect this
> >> distribution by visitors, perhaps, but not by visits.
> >>
> >> A:
> >> Very interesting observation! So people from Sierra Leone try
> >> 'sl.wikipedia.org'.
> >> Why people from Suriname go to 'sr.wikipedia.org' is only slightly less
> >> obvious to me, but apparently it happens.
> >>
> >> For countries with just a few hits in the sampled log the distinction
> >> between visitors and visits gets blurred.
> >>
> >> ------------------------------------------
> >>
> >> R: Andre Engels
> >> Ukrainian is not a small language by any means, yet Wikipedia visitors
> >> tend to be drawn to the Russian Wikipedia instead.
> >>
> >> A: Yes but article growth in Ukrainian Wikipedia has been speeding up in
> >> recent months. [1]
> >>
> >> [1] http://stats.wikimedia.org/EN/TablesWikipediaUK.htm
> >>
> >> ------------------------------------------
> >>
> >> R: Andre Engels
> >> The Q3-Q4 comparison for most countries shows a shift from English to
> >> the 'vernacular'.
> >>
> >> A:
> >> Interesting analysis. Let's see if this is a consistent trend.
> >> However, the monthly page views per Wikipedia language, for which we have
> >> a 2-year history, do not show a very significant shift from large to
> >> smaller Wikipedias.
> >> See table 'Distribution of page views' at the bottom of page [1]: smaller
> >> languages gain in share of page views, but very slowly.
> >>
> >> [1] http://stats.wikimedia.org/EN/TablesPageViewsMonthly.htm
> >>
> >> ------------------------------------------
> >>
> >> Q: Nikola Smolenski / Milos Rancic
> >> At Wikipedia Page Views By Country - Breakdown [1] and Wikipedia Page
> >> Views By Country - Trends [2] could you include more languages (ideally
> >> all languages)?
> >> Some of the numbers are going below 0.1% of population, but some of them
> >> are not mentioned even though they are larger than 0.5% of population.
> >>
> >> [1] http://tinyurl.com/yhp3an7
> >> [2] http://tinyurl.com/yzga2hm
> >>
> >> A:
> >> Yes, on some reports I do include smaller percentages for the largest
> >> Wikipedias, as those represent significant numbers of page views.
> >> I used different (and arbitrary) thresholds per report. The
> >> arbitrariness could change, but I want to plead for a notoriety threshold:
> >>
> >> Here is a much more extended version of the breakdown report [1] (for
> >> this discussion only).
> >> It shows up to 50 Wikipedias per country.
> >> An extra column shows the total number of records for this
> >> country/language (for the 6 month period) on which the percentage is
> >> based.
> >> As you can see, for the smallest countries that number is so low that it
> >> is no longer significant.
> >>
> >> Let us say we cut off not at 1%, but at an (arbitrary) absolute
> >> threshold of x logged records per country/language pair (per row).
> >> Let us say we cut off at an average of 5 records per month. Everything
> >> below that threshold in the test report is in dark red.
> >> Personally I think this is still way too much detail for a general
> >> report. Not because of Kb's but information overload.
> >>
> >> [1] http://tinyurl.com/yjwoyre
> >>
> >>
> >>
> >>
> >> _______________________________________________
> >> foundation-l mailing list
> >> foundation-l at lists.wikimedia.org
> >> Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/foundation-l
> >>
> >
>
>
>
> --
> Ziko van Dijk
> NL-Silvolde
>
>
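
The absolute cut-off discussed at the end of the quoted Q&A (drop a
country/language row when it is backed by fewer than an average of 5 sampled
records per month over the 6 month period) could be sketched roughly as
follows. This is only an illustration, not the code behind the actual
reports: the record layout (one (country, language) tuple per sampled log
record), the function name `breakdown` and the constant names are all
assumptions made for this example.

```python
from collections import Counter

MONTHS = 6             # reporting period mentioned in the Q&A
MIN_AVG_PER_MONTH = 5  # example cut-off from the Q&A (arbitrary)


def breakdown(records, months=MONTHS, min_avg=MIN_AVG_PER_MONTH):
    """Build a per-country language breakdown with an absolute cut-off.

    records: iterable of (country, language) tuples, one per sampled
             log record (hypothetical input format).
    Returns {country: [(language, share_percent, record_count), ...]},
    keeping only pairs with at least months * min_avg records.
    """
    pair_counts = Counter(records)

    # Total records per country, used as the denominator for shares.
    country_totals = Counter()
    for (country, _language), n in pair_counts.items():
        country_totals[country] += n

    threshold = months * min_avg
    report = {}
    for (country, language), n in pair_counts.items():
        if n < threshold:
            continue  # too few records for the percentage to mean much
        share = round(100.0 * n / country_totals[country], 1)
        report.setdefault(country, []).append((language, share, n))

    # Largest rows first within each country.
    for rows in report.values():
        rows.sort(key=lambda row: row[2], reverse=True)
    return report
```

With the defaults, a pair needs at least 30 records to be listed, so a
country with only a handful of sampled hits drops out of the report
entirely instead of showing percentages based on almost no data.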

