Hi Everyone,
Starting with this release, DBpedia provides an alternative view of its data:
RDF dumps based on Wikidata IDs.
http://wiki.dbpedia.org/dbpedia-version-2016-04
For example:
- disambiguations_en.ttl.bz2
<http://downloads.dbpedia.org/2016-04/core-i18n/en/disambiguations_en.ttl.bz2>
(DBpedia URIs)
- disambiguations_wkd_uris_en.ttl.bz2
<http://downloads.dbpedia.org/2016-04/core-i18n/en/disambiguations_wkd_uris_en.ttl.bz2>
(the same data, but with all DBpedia URIs converted to Wikidata-based IDs)
We need these dumps for our ongoing tasks but we also want to share these
with the Wikidata community as we think they may be useful.
One of the side tasks that we have planned, but never found enough
people to work on, is to identify Wikipedia / Wikidata data overlaps and
data conflicts, and to identify areas where e.g. Wikidata data are fresher,
stale or missing.
Another task that came up during a discussion with Lydia and Daniel at
the DBpedia meeting in Leipzig last month is to use these dumps to fix
errors in Wikidata. The example we discussed involves interlinks and
disambiguations: e.g. an interlink cluster that consists entirely of
disambiguation pages except for one link (which is most probably wrong).
This was a real example that Daniel came up with, and such cases can be
easily identified with these dumps.
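
As a rough sketch of how such a check might look (not a tested pipeline):
the Python below collects the subjects of the per-language
disambiguations_wkd_uris dumps and flags items where exactly one linked
language is not a disambiguation page. It assumes that the dumps contain one
triple per line and that the subject of each triple is the disambiguation
page. The function names, the min_langs threshold and the sitelinks mapping
(item to set of language codes, which would have to come from Wikidata
itself) are hypothetical and only for illustration.

```python
import bz2
import re

# Assumption: one triple per line, in the form <subject> <predicate> <object> .
TRIPLE = re.compile(r'^<([^>]+)>\s+<([^>]+)>\s+<([^>]+)>\s*\.\s*$')


def disambiguation_subjects(lines):
    """Return the set of subject URIs found in an iterable of triple lines."""
    subjects = set()
    for line in lines:
        match = TRIPLE.match(line.strip())
        if match:  # comments, blank lines and non-matching lines are skipped
            subjects.add(match.group(1))
    return subjects


def read_dump(path):
    """Collect subjects from a locally downloaded .ttl.bz2 dump."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        return disambiguation_subjects(handle)


def odd_ones_out(disambig_by_lang, sitelinks, min_langs=3):
    """Flag items that are disambiguations in all linked languages but one.

    disambig_by_lang: language code -> set of item URIs that are
                      disambiguation pages in that language.
    sitelinks:        item URI -> set of language codes the item links to
                      (hypothetical input, not part of the DBpedia dumps).
    Returns a dict: item URI -> the single non-disambiguation language.
    """
    suspects = {}
    for item, langs in sitelinks.items():
        if len(langs) < min_langs:
            continue  # too few languages to call anything an outlier
        non_disambig = sorted(
            lang for lang in langs
            if item not in disambig_by_lang.get(lang, set())
        )
        if len(non_disambig) == 1:
            suspects[item] = non_disambig[0]
    return suspects
```

With one call to read_dump per language, the resulting sets can be passed to
odd_ones_out as disambig_by_lang. The flagged items are only candidates for
manual review, since the odd language may simply be the correct one.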
There may be other cases where these dumps can be useful, but you are
better placed to judge this.
How to move on.
After a quick discussion, it was suggested that we create a Phabricator
task for each of these, but before I proceed I would like to get some
initial community feedback.
Best,
Dimitris
--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://rdfunit.aksw.org, http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: AKSW/KILT http://aksw.org/Groups/KILT