Hello,
I work for a consulting firm called Strategy&. We have been engaged by Facebook, on
behalf of Internet.org, to conduct a study assessing the state of connectivity globally.
One key area of focus is the availability of relevant online content. We are using the
availability of encyclopedic knowledge in one's primary language as a proxy for
relevant content, defined as 100K+ Wikipedia articles in that language. We have a few
questions related to this analysis prior to publishing it:
* We are currently using the article count by language from the Wikimedia
Foundation's public page: http://meta.wikimedia.org/wiki/List_of_Wikipedias. Is this
a reliable source for article counts, and does it include stubs?
* Is it possible to get historic data for article counts? It would be great to
monitor the evolution of the metric we have defined over time.
* What are the biggest drivers you've seen for step changes in the number of
articles (e.g., number of active admins, machine translation, etc.)?
* We had to map Wikipedia language codes to the ISO 639-3 codes used by Ethnologue
(the source we are using for primary-language data). The two-letter code for a
Wikipedia language in the "List of Wikipedias" sometimes, but not always, matches
the ISO 639-1 code. Is there an easy way to do the mapping?
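For context on the last question, here is a minimal sketch of how we are currently approaching the mapping: a standard ISO 639-1 to 639-3 bridge table for the codes that do match, plus a hand-maintained override table for Wikipedia codes that are not valid ISO 639-1. The table entries below are illustrative samples only, not our full mapping, and would need to be verified against Ethnologue.

```python
from typing import Optional

# Bridge table for Wikipedia codes that coincide with ISO 639-1
# (sample entries only; the full table has ~180 rows).
ISO_639_1_TO_3 = {
    "en": "eng",
    "ar": "ara",
    "fr": "fra",
}

# Overrides for Wikipedia codes that are NOT valid ISO 639-1
# (hypothetical examples; each needs verification against Ethnologue).
WIKI_OVERRIDES = {
    "simple": "eng",  # Simple English Wikipedia -> English
    "als": "gsw",     # als.wikipedia.org is Alemannic, while ISO 639-3 "als" is Tosk Albanian
}

def wiki_to_iso639_3(wiki_code: str) -> Optional[str]:
    """Return the ISO 639-3 code for a Wikipedia language code, or None if unmapped."""
    if wiki_code in WIKI_OVERRIDES:
        return WIKI_OVERRIDES[wiki_code]
    return ISO_639_1_TO_3.get(wiki_code)
```

As the "als" example shows, a purely mechanical two-letter match can map a Wikipedia to the wrong language entirely, which is why we would appreciate guidance on an authoritative mapping.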
Many Thanks,
Rawia
Formerly Booz & Company
Rawia Abdel Samad
Direct: +9611985655 | Mobile: +97455153807
Email: Rawia.AbdelSamad@strategyand.pwc.com
www.strategyand.com