I work for a consulting firm called Strategy&. We have been engaged by Facebook on behalf of Internet.org to conduct a study assessing the state of connectivity globally. One key area of focus is the availability of relevant online content. We are using the availability of encyclopedic knowledge in one's primary language as a proxy for relevant content. We define this as 100K+ Wikipedia articles in one's primary language. We have a few questions related to this analysis prior to publishing it:
* We are currently using the article count by language from the Wikimedia Foundation's public list: http://meta.wikimedia.org/wiki/List_of_Wikipedias. Is this a reliable source for article counts, and does it include stubs?
* Is it possible to get historic data for article counts? It would be great to monitor the evolution of the metric we have defined over time.
* What are the biggest drivers you've seen for step changes in the number of articles (e.g., number of active admins, machine translation, etc.)?
* We had to map Wikipedia language codes to ISO 639-3 language codes in Ethnologue (the source we are using for primary language data). The two-letter language code for a Wikipedia language in the "List of Wikipedias" sometimes, but not always, matches the ISO 639-1 code. Is there an easy way to do the mapping?
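To make the last question concrete, here is a minimal sketch of the kind of mapping we have in mind, written in Python. All table and function names are hypothetical, and the tables are small hand-picked samples, not an official Wikimedia or Ethnologue mapping. A real run would load the full ISO 639-3 code table instead. The sketch checks a list of known exceptions first, because some Wikipedia codes look like valid ISO codes but refer to a different language.

```python
# Sample of ISO 639-1 (two-letter) to ISO 639-3 (three-letter) codes.
# Assumption: a real analysis would load the complete table.
ISO_639_1_TO_3 = {"en": "eng", "fr": "fra", "de": "deu", "ar": "ara", "zh": "zho"}

# Sample of ISO 639-3 codes that some Wikipedias use directly.
KNOWN_ISO_639_3 = {"ceb", "war"}

# Wikipedia codes that do not follow ISO 639-1 or 639-3 (hand-picked sample).
# "als" is listed because the Alemannic Wikipedia uses it, while ISO 639-3
# assigns "als" to Tosk Albanian and "gsw" to Alemannic.
WIKI_OVERRIDES = {
    "simple": "eng",
    "als": "gsw",
    "zh-yue": "yue",
    "zh-min-nan": "nan",
    "zh-classical": "lzh",
    "bat-smg": "sgs",
    "fiu-vro": "vro",
    "roa-rup": "rup",
}


def wiki_code_to_iso_639_3(wiki_code):
    """Return the ISO 639-3 code for a Wikipedia language code, or None."""
    code = wiki_code.strip().lower()
    # Exceptions win over everything else.
    if code in WIKI_OVERRIDES:
        return WIKI_OVERRIDES[code]
    # Two-letter codes are treated as ISO 639-1.
    if len(code) == 2 and code in ISO_639_1_TO_3:
        return ISO_639_1_TO_3[code]
    # Three-letter codes are accepted only if they are known ISO 639-3 codes.
    if len(code) == 3 and code in KNOWN_ISO_639_3:
        return code
    # Anything else needs manual review.
    return None


if __name__ == "__main__":
    for wiki_code in ["en", "als", "ceb", "zh-yue", "xx"]:
        print(wiki_code, "->", wiki_code_to_iso_639_3(wiki_code))
```

Codes that come back as None would be reviewed by hand and added to the exception table.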
Formerly Booz & Company
Rawia Abdel Samad
Direct: +9611985655 | Mobile: +97455153807
I'm inquiring about the delay for publishing the January compressed Wikistats files that are maintained by Erik Zachte. I'm guessing those processes are given a low priority compared to the content backups that need to run. More generally, I'm interested in finding new ways that I can help out. I'm an ex-Microsoftie who is now on the fraud analytics team at TD Bank. I've been involved with the Wikimedia group in Atlanta. I organize the picnic each summer, and helped get the rest of the historic buildings photographed. I've dabbled in reverting vandalism, and I contribute to articles when I actually have something to contribute. I don't feel like I've settled into a contributor role that really fits me yet though.
I enjoy using a variety of the traffic data sets that Wikimedia publishes. It seems the traffic servers get bogged down sometimes though. Can I help? Should I try to get the Atlanta group to pool our donations this year for an extra computer?
My username is rbaasland and I would like to contribute to the analytics
project. I was wondering if I could have access to the project, or how I go
about contributing to this project?
Thank you very much,
analytics-store tmp space filled up today with many large temporary
tables (it was ~32G) from many slow research queries. Those had to be
killed, the database process restarted, and tmp space expanded.
It's back up now.
DBA @ WMF
I am thrilled to announce our speaker lineup for this month’s research showcase <https://www.mediawiki.org/wiki/Analytics/Research_and_Data/Showcase#April_2…>.
Jeff Nickerson (Stevens Institute of Technology) will talk about remix and reuse in collaborative communities; Heather Ford (Oxford Internet Institute) will present an overview of the oral citations debate in the English Wikipedia.
The showcase will be recorded and publicly streamed at 11.30 PT on Thursday, April 30 (livestream link will follow). We’ll hold a discussion and take questions from remote attendees via the Wikimedia Research IRC channel (#wikimedia-research <http://webchat.freenode.net/?channels=wikimedia-research> on freenode) as usual.
Looking forward to seeing you there.
Creating, remixing, and planning in open online communities
Paradoxically, users in remixing communities don’t remix very much. But an analysis of one remix community, Thingiverse, shows that those who actively remix end up producing work that is in turn more likely to be remixed. What does this suggest about Wikipedia editing? Wikipedia allows more types of contribution, because creating and editing pages are done in a planning context: plans are discussed on particular loci, including project talk pages. Plans on project talk pages lead to both creation and editing; some editors specialize in making article changes and others, who tend to have more experience, focus on planning rather than acting. Contributions can happen at the level of the article and also at a series of meta levels. Some patterns of behavior – with respect to creating versus editing and acting versus planning – are likely to lead to more sustained engagement and to higher quality work. Experiments are proposed to test these conjectures.
Authority, power and culture on Wikipedia: The oral citations debate
In 2011, Wikimedia Foundation Advisory Board member Achal Prabhala was funded by the WMF to run a project called 'People are knowledge', or the Oral citations project <https://meta.wikimedia.org/wiki/Research:Oral_Citations>. The goal of the project was to respond to the dearth of published material about topics of relevance to communities in the developing world. Although the majority of articles in languages other than English remain intact, the English editions of these articles have had their oral citations removed. I ask why this happened, what the policy implications are for oral citations generally, and what steps can be taken in the future to respond to the problem that this project (and more recent versions of it <https://meta.wikimedia.org/wiki/Research:Indigenous_Knowledge>) set out to solve. This talk comes out of an ethnographic project in which I have interviewed some of the actors involved in the original oral citations project, including the majority of editors of the surr <https://en.wikipedia.org/wiki/surr> article that I trace in a chapter of my PhD <http://www.oii.ox.ac.uk/people/?id=286>.