Our plan here is to map all Wikidata properties to the DBpedia Ontology and
then have the information needed to compare the coverage of Wikidata
with all infoboxes.
This is a really exciting project that would improve both Wikidata and
DBpedia. I would be interested to know more, especially about what has
already been done in terms of mapping and what remains to be done.
I see, for example, that DBpedia has a list of missing properties,
but I don't know whether it is up to date.
2018-01-15 19:57 GMT+01:00 Magnus Knuth <knuth(a)informatik.uni-leipzig.de>:
last year, we applied for a Wikimedia grant to feed qualified data from
Wikipedia infoboxes (i.e. missing statements with references) via the
DBpedia software into Wikidata. The evaluation was already quite good, but
some parts were still missing and we would like to ask for your help and
feedback for the next round. The new application is here:
The main purpose of the grant is:
- Wikipedia infoboxes are quite rich, manually curated, and referenced.
DBpedia already extracts that data quite well (i.e. there is no other
software that does it better). However, extracting references is not a
priority on our agenda. They would be very useful to Wikidata, but
there are no user requests for this from DBpedia users.
- DBpedia also has the information from all infoboxes of all Wikipedia
editions (>10k pages), so we also know quite well where Wikidata is
already used, and where information is available in Wikidata or in one
language version but missing in another.
- side-goal: bring the Wikidata, Wikipedia and DBpedia communities closer together
Here is a diff between the old and new proposal:
- extraction of infobox references will still be a goal of the reworked
proposal
- we have been working on the fusion and data comparison engine (the part
of the budget that came from us) for a while now, and there are first
results. We only took three properties for now and showed the gain where
no Wikidata statement was available; birthDate/deathDate is already quite
good. Details here: https://drive.google.com/file/
Our plan here is to map all Wikidata properties to the DBpedia Ontology
and then have the information needed to compare the coverage of Wikidata
with all infoboxes.
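To make the coverage-comparison idea concrete, here is a minimal sketch in Python. The property mapping, entity IDs, and data values below are all illustrative toy assumptions, not actual DBpedia or Wikidata data; the real comparison would of course run over the full extraction dumps. The sketch counts, per mapped property, how many entities have a value in the DBpedia infobox extraction but no corresponding Wikidata statement yet:

```python
# Hypothetical mapping: Wikidata property ID -> DBpedia Ontology property
PROPERTY_MAPPING = {
    "P569": "dbo:birthDate",
    "P570": "dbo:deathDate",
}

# Toy extracts for illustration only: entity -> {property: value}
wikidata_statements = {
    "Q1": {"P569": "1879-03-14"},  # has birth date, lacks death date
    "Q2": {},                      # has no statements yet
}
dbpedia_facts = {
    "Q1": {"dbo:birthDate": "1879-03-14", "dbo:deathDate": "1955-04-18"},
    "Q2": {"dbo:birthDate": "1867-11-07"},
}

def coverage_gain(wikidata, dbpedia, mapping):
    """For each mapped property, count entities where DBpedia extracted a
    value from an infobox but Wikidata has no statement yet (the potential
    'gain' for Wikidata)."""
    gain = {wd_prop: 0 for wd_prop in mapping}
    for entity, wd_stmts in wikidata.items():
        dbp = dbpedia.get(entity, {})
        for wd_prop, dbo_prop in mapping.items():
            if wd_prop not in wd_stmts and dbo_prop in dbp:
                gain[wd_prop] += 1
    return gain

print(coverage_gain(wikidata_statements, dbpedia_facts, PROPERTY_MAPPING))
# -> {'P569': 1, 'P570': 1}
```

With the full property mapping in place, the same loop would also report the inverse direction (values in Wikidata missing from an infobox), which is what the comparison with all infoboxes amounts to.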
- we will remove the text extraction part from the old proposal (which is
here for your reference:
https://meta.wikimedia.org/wiki/Grants:Project/DBpedia/CrossWikiFact).
This will still be a focus during our work in 2018, together with Diffbot
and the new DBpedia NLP department, but we think that it distracted from
the core of the proposal. Results from the Wikipedia article text
extraction can be added later once they are available, and discussed
separately.
- We proposed to build an extra website that helps to synchronize all
Wikipedias and Wikidata, with DBpedia as its backend. While an external
website is not an ideal solution, we are lacking alternatives. The Primary
Sources Tool is mainly for importing data into Wikidata, not so much for
synchronization. The MediaWiki instances of the Wikipedias do not seem to
have any good interfaces for providing suggestions and pinpointing missing
information. Especially for this part, we would like to ask for your help
and suggestions, either by mail to the list or on the talk page:
We are looking forward to a fruitful collaboration with you and we thank
you for your feedback!
All the best
Institut für Informatik
Abt. Betriebliche Informationssysteme, AKSW/KILT
04109 Leipzig DE
tel: +49 177 3277537
Wikidata mailing list