Dear Wikidata Query Service experts,
In connection with an editathon, I have compiled statistics on the number
of women and men on the Danish Wikipedia. I used WDQS for that, and the
query is listed below:
SELECT ?count ?gender ?genderLabel
WITH {
  SELECT ?gender (COUNT(*) AS ?count) WHERE {
    ?item wdt:P31 wd:Q5 .
    ?item wdt:P21 ?gender .
    ?article schema:about ?item .
    ?article schema:isPartOf <https://da.wikipedia.org/>
  }
  GROUP BY ?gender
} AS %results
WHERE {
  INCLUDE %results
  SERVICE wikibase:label { bd:serviceParam wikibase:language "da,en". }
}
ORDER BY DESC(?count)
LIMIT 25
http://tinyurl.com/y8twboe5
As the statistics could potentially create some discussion (and already
seem to have), I am wondering whether there are experts who could
peer review the SPARQL query and tell me if there are any issues. I hope
I have not made a blunder...
The minor issues I can think of are:
- Missing gender in Wikidata. We have around 360 of these.
- People on the Danish Wikipedia not on Wikidata. Probably tens-ish or
hundreds-ish!?
- Gendered items that are not instances of human (Q5). The ones I
sampled were all fictional humans.
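The first and third points could be checked with queries along these lines. Both are untested sketches that reuse the same prefixes and site URL as the query above. Matching "no gender" as "no truthy P21 statement" is an assumption, and the second query may be slow on the public endpoint.

```sparql
# Sketch: count humans with a Danish Wikipedia article but no P21 (gender) statement.
SELECT (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P31 wd:Q5 .
  ?article schema:about ?item ;
           schema:isPartOf <https://da.wikipedia.org/> .
  FILTER NOT EXISTS { ?item wdt:P21 [] }
}
```

```sparql
# Sketch: gendered items with a Danish Wikipedia article that are NOT instances of human (Q5),
# grouped by what they are an instance of.
SELECT ?class ?classLabel (COUNT(DISTINCT ?item) AS ?count) WHERE {
  ?item wdt:P21 ?gender ;
        wdt:P31 ?class .
  ?article schema:about ?item ;
           schema:isPartOf <https://da.wikipedia.org/> .
  FILTER NOT EXISTS { ?item wdt:P31 wd:Q5 }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "da,en". }
}
GROUP BY ?class ?classLabel
ORDER BY DESC(?count)
```

The second point (people on the Danish Wikipedia without a Wikidata item) cannot be checked this way, since WDQS only sees articles that are already linked to an item.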
We previously reached 17.2% females. Now we are below 17%, due to a
mass import of Japanese football players as far as we can see.
best regards
Finn Årup Nielsen
http://people.compute.dtu.dk/faan/
_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata