We are releasing the alpha version of the Wikidata Knowledge Imbalance
Dashboard (https://prowd.netlify.com/). The tool measures knowledge
imbalances on Wikidata using the Gini index, computed from the existence
of properties across entities.
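As a rough illustration of the idea (not the dashboard's actual code), the
sketch below computes a Gini index from the number of properties present
on each entity. The function name gini and the sample counts are
assumptions made for this example, and the dashboard may use a different
formulation of the index.

```python
def gini(values):
    """Gini index of non-negative values: 0 means perfectly balanced."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        # No entities, or no properties at all: treat as balanced.
        return 0.0
    # Standard rank-weighted formula over the values in ascending order.
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return (2 * weighted) / (n * total) - (n + 1) / n


# Hypothetical data: number of properties present on each of six entities.
property_counts = [12, 3, 7, 1, 9, 4]
print(round(gini(property_counts), 2))
```

A value of 0 means every entity has the same number of properties, and
values closer to 1 mean the properties are concentrated on few entities.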
In the examples, data about infectious diseases [1] is shown to be
imbalanced (Gini = 0.26), whereas data about countries [2] is balanced
(Gini = 0.14). Other examples include programming languages [3] (Gini =
0.37) and association football clubs [4] (Gini = 0.29).
The tool also supports the scenario where instead of analyzing all possible
properties of entities, one may focus on specific properties of interest.
With respect to diseases [5], for example, analyzing the existence of the
properties "has effect" (P1542), "possible treatment" (P924), "drug used
for treatment" (P2176), and "symptoms" (P780) reveals that Wikidata is
heavily imbalanced.
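To make the focused scenario concrete, here is a minimal sketch that
counts, for each entity, how many of the selected properties are present.
The property IDs are the ones listed above. The entity names, the
entity_props mapping, and the helper coverage_counts are hypothetical and
only illustrate the kind of input an imbalance measure would work on.

```python
# Property IDs of interest, as named in the text.
PROPERTIES_OF_INTEREST = {"P1542", "P924", "P2176", "P780"}


def coverage_counts(entity_props, properties):
    """Map each entity to how many of the given properties it has."""
    return {
        entity: len(props & properties)
        for entity, props in entity_props.items()
    }


# Hypothetical entities and the properties that exist on each of them.
entity_props = {
    "disease_a": {"P1542", "P924", "P2176", "P780", "P31"},
    "disease_b": {"P780", "P31"},
    "disease_c": {"P31"},
}

counts = coverage_counts(entity_props, PROPERTIES_OF_INTEREST)
for entity, count in sorted(counts.items()):
    print(entity, count)
```

The resulting per-entity counts are what a Gini index restricted to the
chosen properties would be computed over.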
As the tool is still in a preliminary stage, we invite you to give it a try
at https://prowd.netlify.com/, and welcome any constructive feedback!
Regards,
Fariz
Example links:
[1] https://commons.wikimedia.org/wiki/File:Screenshot-prowd.netlify.com-2020.0…
[2] https://commons.wikimedia.org/wiki/File:Screenshot-prowd.netlify.com-2020.0…
[3] https://commons.wikimedia.org/wiki/File:Screenshot-prowd.netlify.com-2020.0…
[4] https://commons.wikimedia.org/wiki/File:Screenshot-prowd.netlify.com-2020.0…
[5] https://commons.wikimedia.org/wiki/File:Screenshot-prowd.netlify.com-2020.0…