Hullo everyone.
I was asked by a volunteer for help getting stats on the gender gap in content on a certain Wikipedia, and came up with simple Wikidata Query Service[1] queries that pulled the total number of articles on a given Wikipedia about men and about women, to calculate *the proportion of articles about women out of all articles about humans*.
Then I was curious about how that wiki compared to other wikis, so I ran the queries on a bunch of languages, and gathered the results into a table, here: https://meta.wikimedia.org/wiki/User:Ijon/Content_gap
(Please see the *caveat* there.)
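
For anyone who wants to try this on their own wiki, here is a minimal sketch of the kind of counting query described above, wrapped in Python. It is not necessarily the exact query behind the table. The helper names `build_count_query` and `share_of_women` are hypothetical, and the sketch assumes the denominator is simply men plus women, so items with any other or no gender statement are left out. The Wikidata identifiers used (P31 "instance of", Q5 "human", P21 "sex or gender", Q6581072 "female", Q6581097 "male") are the standard ones.

```python
# Sketch only: builds a SPARQL query for the Wikidata Query Service and
# computes a percentage from two counts. Function names are illustrative.

FEMALE = "Q6581072"  # Wikidata item for "female"
MALE = "Q6581097"    # Wikidata item for "male"


def build_count_query(wiki_code: str, gender_qid: str) -> str:
    """Return a SPARQL query counting humans of one gender that have
    an article on the Wikipedia with the given language code."""
    return f"""
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {{
  ?item wdt:P31 wd:Q5 ;
        wdt:P21 wd:{gender_qid} .
  ?article schema:about ?item ;
           schema:isPartOf <https://{wiki_code}.wikipedia.org/> .
}}
""".strip()


def share_of_women(women: int, men: int) -> float:
    """Percentage of articles about women among articles about women and men.

    Assumption: only these two counts form the denominator.
    """
    total = women + men
    if total == 0:
        raise ValueError("no biographies counted")
    return 100.0 * women / total


if __name__ == "__main__":
    # Paste the printed queries into https://query.wikidata.org/ to run them.
    print(build_count_query("he", FEMALE))
    print(build_count_query("he", MALE))
    # The counts below are invented purely to show the arithmetic.
    print(share_of_women(150, 850))
```

Running one query per gender and feeding the two counts into `share_of_women` gives the percentage for that wiki; swapping the language code repeats it for another Wikipedia.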
I don't have time to fully write up everything I find interesting in those results, but I will quickly point out the following:
1. The Nepali statistic is simply astonishing! There must be a story there. I'm keen on learning more about this, if anyone can shed light.
2. Evidently, roughly 13%-17% seems to be the typical proportion of articles about women among all biographies.
3. Among the top 10 largest wikis, Japanese is the least imbalanced. Good job, Japanese Wikipedians! I wonder if you have a good sense of what drives this relatively better balance. (My instinctive guess is pop culture coverage.)
4. Among the top 10 largest wikis, Russian is the most imbalanced.
5. I intend to re-generate these stats every two months or so, to eventually have some sense of trends and changes.
6. Your efforts, particularly on small-to-medium wikis, can really make a dent in these numbers! For example, it seems I am personally responsible[2] for almost 1% of the coverage of women on Hebrew Wikipedia! :)
7. I encourage you to share these numbers with your communities. Perhaps you'd like to overtake the wiki just above yours? :)
8. I'm happy to add additional languages to the table, by request. Or you can do it yourself, too. :)
A.
[1] https://query.wikidata.org/
[2] Yay #100wikidays :) https://meta.wikimedia.org/wiki/100wikidays