Hello Fariz, many thanks for the nice tool. It is also useful for finding which properties are probably missing on which entities. If you want to work on it further: I think the graph is a bit inexact, maybe just at the 10th percentile. One can compare it with https://en.wikipedia.org/wiki/Gini_coefficient#/media/File:Lorenz_curve_glob...
Best regards Michal
________________________________
Date: Tue, 24 Mar 2020 05:31:29 +0700
From: Fariz Darari fadirra@gmail.com
To: "Discussion list for the Wikidata project." wikidata@lists.wikimedia.org
Subject: [Wikidata] Wikidata knowledge imbalance dashboard - Alpha release
Message-ID: CAN0WnMHDUjdR-gtf=enRP+mSp7gcWTekZFA=31Ot6nud5JU+0Q@mail.gmail.com
Content-Type: text/plain; charset="utf-8"
We are releasing the alpha version of the Wikidata Knowledge Imbalance Dashboard (https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fprowd.netl...). The tool measures knowledge imbalances on Wikidata using the Gini index, based on property existence over entities.
In the examples, data about infectious diseases [1] is shown to be imbalanced (Gini = 0.26), whereas data about countries [2] is balanced (Gini = 0.14). Other examples include programming languages [3] (Gini = 0.37) and association football clubs [4] (Gini = 0.29).
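To make the metric concrete, here is a minimal sketch of a Gini index over per-entity property counts. It assumes that the "amount of knowledge" of an entity is simply the number of distinct properties it has; the dashboard's exact definition is not spelled out in this thread, and the function name, entity IDs, and sample data below are made up for illustration.

```python
def gini(counts):
    """Gini index of non-negative numbers (0 = perfectly balanced)."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        # No entities, or no properties at all: treat as balanced.
        return 0.0
    # Standard formula on ascending-sorted data, with ranks i = 1..n:
    # G = 2 * sum(i * x_i) / (n * sum(x_i)) - (n + 1) / n
    weighted = sum(i * x for i, x in enumerate(values, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical entities mapped to the set of properties they carry.
entities = {
    "Q_a": {"P31", "P279", "P780", "P2176"},
    "Q_b": {"P31", "P279"},
    "Q_c": {"P31"},
    "Q_d": {"P31"},
}

# Assumption: one "unit of knowledge" per property present on an entity.
property_counts = [len(props) for props in entities.values()]
print(gini(property_counts))
```

A value near 0 means the entities carry roughly the same number of properties; a value near 1 means a few entities hold most of the property statements.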
The tool also supports the scenario where, instead of analyzing all possible properties of entities, one may focus on specific properties of interest. With respect to diseases [5], for example, analyzing the existence of the properties "has effect" (P1542), "possible treatment" (P924), "drug used for treatment" (P2176), and "symptoms" (P780) reveals that Wikidata is heavily imbalanced.
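As a rough illustration of the property-focused scenario, the sketch below counts only the four properties named above per entity and derives the points of a Lorenz curve from those counts. The entity IDs and their property sets are hypothetical, and the helper names are not taken from the tool; how the dashboard itself builds its graph is an assumption here.

```python
# Property IDs from the message above; everything else is illustrative.
PROPERTIES_OF_INTEREST = {"P1542", "P924", "P2176", "P780"}


def restricted_counts(entities, properties):
    """Number of the selected properties present on each entity."""
    return [len(props & properties) for props in entities.values()]


def lorenz_points(counts):
    """(cumulative share of entities, cumulative share of counts) pairs."""
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    points = [(0.0, 0.0)]
    running = 0
    for i, x in enumerate(values, start=1):
        running += x
        # If nothing is present anywhere, fall back to the equality diagonal.
        share = running / total if total else i / n
        points.append((i / n, share))
    return points


# Hypothetical disease entities and the properties they carry.
diseases = {
    "Q_w": {"P31", "P780", "P2176", "P924", "P1542"},
    "Q_x": {"P31", "P780"},
    "Q_y": {"P31"},
    "Q_z": {"P31", "P279"},
}

counts = restricted_counts(diseases, PROPERTIES_OF_INTEREST)
for entity_share, knowledge_share in lorenz_points(counts):
    print(f"{entity_share:.2f} {knowledge_share:.2f}")
```

The further the curve sags below the diagonal, the more the selected properties are concentrated on a few entities.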
As the tool is still in a preliminary stage, we invite you to give it a try at https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fprowd.netl..., and welcome any constructive feedback!
Regards, Fariz
Example links:
[1] https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommons.wi...
[2] https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommons.wi...
[3] https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommons.wi...
[4] https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommons.wi...
[5] https://eur02.safelinks.protection.outlook.com/?url=https%3A%2F%2Fcommons.wi...
Hi Michal,
thanks for the suggestion!
We fixed the Gini graph as suggested.
Regards, Fariz
On Fri, Apr 3, 2020 at 8:17 PM Pavlovic, Michal <Michal.Pavlovic@newayselectronics.com> wrote:
Hello Fariz, many thanks for the nice tool, useful also to find which properties are probably missed on which entities. If you want to work on it further: I think, the graph is a bit inexact, maybe just at the 10%-percentile. One can compare with
https://en.wikipedia.org/wiki/Gini_coefficient#/media/File:Lorenz_curve_glob...
Best regards Michal