It might be realistic to gather statistics for a small event held in one location, but
for any large event, with multiple physical locations and/or online
participation, it would be very difficult. Having been involved in supporting real-world
events, I am well aware that many people believe the organisers have nothing else to do
but gather statistics. In fact, you are running around like a headless chook the whole day
because there are so many things to be done to run the event at all and there are usually
too few organisers/helpers relative to the number of mostly newbie participants, so
statistics gathering is the last thing on your mind.
Also, some people contribute because they are participants in the editathon, but there
are also contributions (both helpful and unhelpful) from other members of the community
who are just reacting in their usual way to Wikipedia contributions and may not regard
themselves as part of the editathon and/or may be completely unaware of it.
As a concrete example, Wikibomb 2014 (an editathon aimed at creating articles for
Australian female scientists selected by the Australian Academy of Science) had multiple
physical sites in different cities plus on-line participants and took place on a single
day. (I was at one of the physical locations and was too busy to count the participants,
but there were around 30 people.) We asked that participants add the category
https://en.wikipedia.org/wiki/Category:Wikibomb2014
to their articles, but obviously we cannot be sure that they did; most were new to
Wikipedia editing and may not even have understood what we were asking them to do.
However, using the category, we do have a set of 118 articles that we know were created as
part of the event. Some may have been created in advance or after the event but still used
the category, and so were presumably part of the event in terms of intent. Some may also
have been deleted subsequently: we had issues with sources being the university or
research institute employing the scientist, which made their independence questionable,
and we had copyvios where bios from university websites were copied and pasted, etc.
From that set of 118 articles, you can probably analyse
their edit histories and find the list of contributors in the first day or so, which
should pick up most of the event participants (but also some others). You cannot rely on
the first edit being made by the original participant. I often did the first edit to create
the article if people were being diverted into Articles for Creation (tip: never use AfC at
an event; the success of an event needs immediately visible articles at the end of the day,
which is not possible with AfC), so the first edit may have been done by an experienced
editor as a matter of practicality. But with a certain amount of visual inspection, you
would probably be able to identify the person who contributed the most article text on that
day, and that person would probably be a participant in the event. You might be able to
automate that.
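
A minimal sketch of how that might be automated, assuming each article's revision history
has already been fetched into a list of records holding the user, the timestamp and the
page size after the edit (the MediaWiki API can return these via prop=revisions with
rvprop=user|timestamp|size). The function name, the 24-hour window, and the sample users,
timestamps and sizes are all invented for illustration. Net bytes added is only a rough
proxy for "most article text": it counts markup and does not recognise reverts.

```python
from collections import defaultdict
from datetime import datetime, timedelta


def _parse(ts):
    # MediaWiki-style UTC timestamp, e.g. "2014-08-14T01:00:00Z"
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ")


def top_contributor(revisions, window_hours=24):
    """Return (user, bytes_added_per_user) for edits made within
    window_hours of the article's first revision.

    revisions: list of dicts with keys 'user', 'timestamp', 'size',
    where 'size' is the page size in bytes after that revision.
    """
    revs = sorted(revisions, key=lambda r: _parse(r["timestamp"]))
    if not revs:
        return None, {}
    cutoff = _parse(revs[0]["timestamp"]) + timedelta(hours=window_hours)

    added = defaultdict(int)
    prev_size = 0  # the page did not exist before the first revision
    for rev in revs:
        if _parse(rev["timestamp"]) > cutoff:
            break
        delta = rev["size"] - prev_size
        prev_size = rev["size"]
        if delta > 0:  # ignore edits that only removed text
            added[rev["user"]] += delta

    best = max(added, key=added.get) if added else None
    return best, dict(added)


# Hypothetical history: an experienced editor creates a stub, a new
# participant writes most of the text, others make small or later edits.
sample_revisions = [
    {"user": "ExperiencedEditor", "timestamp": "2014-08-14T01:00:00Z", "size": 120},
    {"user": "NewParticipant", "timestamp": "2014-08-14T01:30:00Z", "size": 2400},
    {"user": "Passerby", "timestamp": "2014-08-14T05:00:00Z", "size": 2450},
    {"user": "NewParticipant", "timestamp": "2014-08-14T06:00:00Z", "size": 3100},
    {"user": "LaterEditor", "timestamp": "2014-08-20T00:00:00Z", "size": 9000},
]

best, added = top_contributor(sample_revisions)
print(best, added)
```

Run over every article in the category, this would give a candidate participant per
article, which would still need a manual check against known organisers and helpers.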
Kerry
From: Wiki-research-l [mailto:wiki-research-l-bounces@lists.wikimedia.org] On Behalf Of
Jonathan Morgan
Sent: Wednesday, 9 December 2015 3:46 AM
To: Research into Wikimedia content and communities
<wiki-research-l(a)lists.wikimedia.org>
Cc: Harsh Gupta <gupta.harsh96(a)gmail.com>
Subject: Re: [Wiki-research-l] Data on editathons held in each Wikipedia Language?
I don't personally know of any central repository for data on past edit-a-thons.
There might be something out there. You could probably get some information from pinging
folks in CE who've worked on Project & Event Grants (Asaf Bartov, Kacie Harold) or
Program Evaluation (Amanda Bittaker, Edward Galvez), or search through past grant
reports... but I'm guessing the data will be sparse and inconsistent, as it is still
collected in a somewhat ad-hoc fashion.
If WMF were to support the development and maintenance of standardized infrastructure for
edit-a-thon tracking--something like Harsh Kothari and Jeph Paul's platform for the
Indian Wikiwomen edit-a-thons (site <http://2015.wikiwomen.in/>, code
<https://github.com/cosmiclattes/wikiwomen/tree/master>)--this would be easier. But
AFAIK that hasn't happened. If someone takes up that cause I will voice my support.
J
On Mon, Dec 7, 2015 at 7:34 PM, Maximilian Klein <isalix(a)gmail.com
<mailto:isalix@gmail.com> > wrote:
Researchians,
I have been collecting data on the gendered biographies of different Wikipedia Languages
from Wikidata dumps, with the question of trying to understand the gender gap in content.
After reading about Propensity Score Matching[1] today, I see it would be possible to test
a (close to) causal link between the genders of Wikipedia Biographies being added to a
language, and Editathon activity. Yet we'd need the data for editathon activity. Is it
compiled somewhere, or can you think of how it could be compiled?
[1] https://en.wikipedia.org/wiki/Propensity_score_matching
The idea in propensity score matching is to pretend a randomized experiment is being
conducted, and to find a "control group" - a similar but untreated language - for each
"treated group".
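
A rough sketch of only the matching step, assuming a propensity score (the estimated
probability of having editathon activity, given a language's other characteristics) has
already been computed for every language, for example by logistic regression. The
function name, the caliper value, and the placeholder language names and scores are
invented for illustration. This uses greedy nearest-neighbour matching without
replacement, which is just one of several matching schemes.

```python
def match_controls(treated, controls, caliper=0.1):
    """Pair each treated unit with the unused control whose propensity
    score is closest, provided the gap is within the caliper.

    treated, controls: dicts mapping unit name -> propensity score.
    Returns a dict mapping treated name -> control name (or None).
    """
    available = dict(controls)  # controls not yet used in a pair
    matches = {}
    for name, score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            matches[name] = None
            continue
        nearest = min(available, key=lambda c: abs(available[c] - score))
        if abs(available[nearest] - score) <= caliper:
            matches[name] = nearest
            del available[nearest]  # match without replacement
        else:
            matches[name] = None  # no sufficiently similar control
    return matches


# Placeholder languages and scores, not real data.
treated = {"lang_A": 0.62, "lang_B": 0.35}
controls = {"lang_C": 0.60, "lang_D": 0.30, "lang_E": 0.90}

matches = match_controls(treated, controls)
print(matches)
```

The outcome (here, the gender mix of newly added biographies) would then be compared
within each matched pair rather than across all languages.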
Make a great day,
Max Klein ‽
http://notconfusing.com/
_______________________________________________
Wiki-research-l mailing list
Wiki-research-l(a)lists.wikimedia.org <mailto:Wiki-research-l@lists.wikimedia.org>
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>