Researchians,

I have been collecting data from Wikidata dumps on the gender of biographies in different Wikipedia languages, to try to understand the gender gap in content. After reading about Propensity Score Matching[1] today, I see it would be possible to test for a (close to) causal link between Editathon activity and the genders of the Wikipedia biographies being added to a language. To do that, however, we'd need data on editathon activity. Is it compiled somewhere, or can you think of how it could be compiled?

[1] https://en.wikipedia.org/wiki/Propensity_score_matching The idea in propensity score matching is to pretend a randomized experiment is being conducted, and to find a "control group" for each "treated group": here, a similar language that had no editathons for each language that did.
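
To make the matching idea concrete, here is a minimal sketch in Python, not a finished analysis. Everything in it is an assumption for illustration: the language names (lang_A ... lang_F), the two covariates (imagine standardized article count and active-editor count), and the treatment flag (1 = the language had editathons) are all made up, since the real editathon data is exactly what is missing. It fits a small logistic regression to estimate each language's propensity to be "treated", then pairs each treated language with the untreated language whose score is closest.

```python
import math

def fit_propensity(X, treated, lr=0.1, steps=2000):
    """Fit a logistic regression by plain gradient descent.
    Returns the weights, with the intercept first."""
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(steps):
        grad = [0.0] * len(w)
        for x, t in zip(X, treated):
            z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
            err = 1.0 / (1.0 + math.exp(-z)) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wi - lr * g / len(X) for wi, g in zip(w, grad)]
    return w

def propensity(w, x):
    """Estimated probability that a unit with covariates x is treated."""
    z = w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))

def match(names, X, treated):
    """Pair each treated unit with the control whose propensity score
    is nearest (matching with replacement)."""
    w = fit_propensity(X, treated)
    scores = {n: propensity(w, x) for n, x in zip(names, X)}
    controls = [n for n, t in zip(names, treated) if t == 0]
    return {
        n: min(controls, key=lambda c: abs(scores[c] - scores[n]))
        for n, t in zip(names, treated)
        if t == 1
    }

# Hypothetical, made-up inputs: NOT real Wikipedia or editathon data.
names = ["lang_A", "lang_B", "lang_C", "lang_D", "lang_E", "lang_F"]
X = [[1.2, 0.8], [0.3, 0.5], [1.0, 0.9], [0.2, 0.4], [-0.8, -1.0], [-1.1, -0.7]]
treated = [1, 1, 0, 0, 0, 0]  # 1 = language had editathons (assumed flag)

pairs = match(names, X, treated)
for treated_lang, control_lang in pairs.items():
    print(treated_lang, "is matched with", control_lang)
```

With real data, the next step would be to compare the gender split of newly added biographies within each matched pair. A proper analysis would also want to check covariate balance after matching, which this sketch skips.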


Make a great day,
Max Klein ‽ http://notconfusing.com/