Hello Wikidata community!
Wikidata is a great platform for collecting information, and the high
quality work of many authors yields very reliable information.
Still, a challenge for users of Wikidata is that there is no way to see
whether *all* data on a certain topic is in Wikidata.
For instance, it is easy to see that Malia and Sasha are children of Obama,
but there is no way to specify that these are all his children.
More generally, Wikidata stores many facts, but it stores no information
about the topics for which it contains all facts.
Today we are happy to share with you a prototype that allows users to add
and manage such completeness information.
We would be glad to get your feedback on how useful you find this tool
and where you see room for improvement.
With our prototype, called COOL-WD (Completeness Tool for Wikidata), one
can:
1. See completeness statements for Wikidata facts
2. Add, remove, aggregate and filter completeness statements
3. See how completeness statements allow conclusions about the completeness
of SPARQL queries over Wikidata.
COOL-WD is available at
http://cool-wd.inf.unibz.it/ and a 3-min demo video
can be found at
http://cool-wd.inf.unibz.it/coolwd-hd.mp4
It employs various libraries, most importantly GWT, Apache Jena, SQLite and
the Wikidata API.
The formal background and a description of the tool, including an indexing
technique for completeness statements,
have been accepted as a research paper at ICWE 2016 (
http://icwe2016.inf.usi.ch/). The paper is available for download at:
http://bit.ly/1VOsRCH
Below are some simple ideas of how completeness information could be useful
to users:
Use Case 1: Rido is a geographer who would like to contribute data about
the administrative divisions of regions to Wikidata.
He cares a great deal about data quality, especially data completeness, and
is collaborating with Simon, another geographer.
However, when completing data on Wikidata, there is currently no way to
mark which data is complete.
Rido and Simon must make these notes about completeness manually in, say, a
Google Doc.
Worse still, Wikidata users cannot appreciate the effort Rido and Simon put
into completing the data, since in the users’ eyes
there is no difference between complete data and incomplete data on
Wikidata.
Demo: Wikidata is complete for all administrative divisions of Saxony (
http://cool-wd.inf.unibz.it/?p=Q1202)
Use Case 2: Jen is a developer of a moviegoer application. She usually
integrates data from multiple sources, including Wikidata.
If some movies on Wikidata have completeness statements, she could optimize
her application
so that it does not search other data sources for those movies.
Demo: Suppose her app asks COOL-WD at
http://cool-wd.inf.unibz.it/?p=query
for the cast and screenwriters of the movie Before Sunset (
http://cool-wd.inf.unibz.it/?p=Q652186):
SELECT * WHERE { wd:Q652186 wdt:P161 ?c . wd:Q652186 wdt:P58 ?s }
Her app then gets not only the query answers but also completeness
information for her query.
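
To illustrate how completeness statements can lead to a conclusion about a
query, here is a minimal Python sketch. It is a simplification for
illustration only, not the algorithm used in COOL-WD: it assumes that a
completeness statement is just an (entity, property) pair, that the query is
a simple list of triple patterns written with prefixed names, and that both
statements below actually exist (they are made up for this example). The
names COMPLETE, triple_patterns and is_query_complete are hypothetical.

```python
# Hypothetical completeness statements: each (entity, property) pair means
# "Wikidata holds all values of this property for this entity".
# Both entries are assumed here purely for illustration.
COMPLETE = {
    ("wd:Q652186", "wdt:P161"),  # cast members of Before Sunset
    ("wd:Q652186", "wdt:P58"),   # screenwriters of Before Sunset
}


def triple_patterns(query):
    """Extract (subject, predicate, object) patterns from a simple query.

    Only handles a single group of triple patterns separated by " . ",
    written with prefixed names and variables (no FILTER, OPTIONAL, etc.).
    """
    body = query[query.index("{") + 1 : query.rindex("}")]
    patterns, current = [], []
    for token in body.split():
        if token == ".":
            patterns.append(current)
            current = []
        else:
            current.append(token)
    if current:
        patterns.append(current)
    for pattern in patterns:
        if len(pattern) != 3:
            raise ValueError(f"not a triple pattern: {pattern}")
    return [tuple(pattern) for pattern in patterns]


def is_query_complete(query, statements):
    """Return True if every triple pattern is covered by a statement.

    A pattern counts as covered only if its subject and predicate match a
    completeness statement exactly; anything else is treated as
    "completeness not guaranteed".
    """
    return all(
        (subject, predicate) in statements
        for subject, predicate, _ in triple_patterns(query)
    )


QUERY = "SELECT * WHERE { wd:Q652186 wdt:P161 ?c . wd:Q652186 wdt:P58 ?s }"

if __name__ == "__main__":
    print(is_query_complete(QUERY, COMPLETE))
```

Under these assumptions, an app like Jen's could skip its other data sources
whenever is_query_complete returns True, and fall back to them otherwise.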
We are looking forward to your feedback!
Best,
Fariz, Simon, Rido, and Werner
Free University of Bozen-Bolzano, Italy