Dear Wikidata Query Service experts,
In connection with an editathon, I have compiled statistics on the number of women and men with articles on the Danish Wikipedia. I used WDQS for this, and the query is listed below:
SELECT ?count ?gender ?genderLabel
WITH {
  SELECT ?gender (COUNT(*) AS ?count) WHERE {
    ?item wdt:P31 wd:Q5 .
    ?item wdt:P21 ?gender .
    ?article schema:about ?item .
    ?article schema:isPartOf <https://da.wikipedia.org/> .
  }
  GROUP BY ?gender
} AS %results
WHERE {
  INCLUDE %results
  SERVICE wikibase:label { bd:serviceParam wikibase:language "da,en". }
}
ORDER BY DESC(?count)
LIMIT 25
As the statistics could potentially create some discussion (and already seem to have), I am wondering whether any experts could peer review the SPARQL query and tell me if there are any issues. I hope I have not made a blunder...
The minor issues I can think of are:
- Missing gender in Wikidata. We have around 360 of these.
- People on the Danish Wikipedia who are not on Wikidata. Probably somewhere in the tens or hundreds?
- Gendered items that are not instances of human (Q5). The ones I sampled were all fictional humans.
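
To check the size of the first issue, a query along the following lines might help. It is only a sketch, built from the same triple patterns as the query above, and it assumes that "missing gender" means the item has no truthy P21 statement at all. Items whose P21 is set to "unknown value" would not be counted by it.

```sparql
# Sketch: count humans with a Danish Wikipedia article but no P21 (sex or gender) statement.
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P31 wd:Q5 .
  ?article schema:about ?item .
  ?article schema:isPartOf <https://da.wikipedia.org/> .
  # Keep only items for which no gender statement can be found.
  FILTER NOT EXISTS { ?item wdt:P21 ?gender }
}
```

Replacing the SELECT line with SELECT ?item ?article and adding a LIMIT would list the affected articles instead of counting them.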
We previously reached 17.2% females. Now we are below 17%, as far as we can see because of a mass import of Japanese football players.
Best regards,
Finn Årup Nielsen
http://people.compute.dtu.dk/faan/