Dear all,


TL;DR: We are working on an *experimental* Wikidata RDF export based on DBpedia and would like some feedback on our future directions.

Disclaimer: this work is not related to, or affiliated with, the official Wikidata RDF dumps.


Our current approach is to treat Wikidata like any other Wikipedia edition and apply our extractors to each Wikidata page (item). This approach generates triples in the DBpedia namespace (http://wikidata.dbpedia.org/resource/). Although this results in some duplication, since Wikidata already provides RDF, we made different design choices and map the Wikidata data directly into the DBpedia ontology.


sample data: http://nl.dbpedia.org/downloads/wikidatawiki/sample/

experimental dump: http://nl.dbpedia.org/downloads/wikidatawiki/20150207/ (for errors, see the notes below)


*Wikidata mapping details*


In the same way that we use mappings.dbpedia.org to define mappings from Wikipedia templates to the DBpedia ontology, we define transformation mappings from Wikidata properties to RDF triples in the DBpedia ontology.


At the moment we provide two types of Wikidata property mappings:

a) through the mappings wiki, in the form of equivalent classes or properties, e.g.:

Property: http://mappings.dbpedia.org/index.php/OntologyProperty:BirthDate

Class: http://mappings.dbpedia.org/index.php/OntologyClass:Person


which will result in the following triples:

wd:Qx a dbo:Person

wd:Qx dbo:birthDate "...."


b) transformation mappings that are (for now) defined in a JSON file [1]. The JSON configuration supports several mapping options; the excerpt below illustrates some of them.



Note also that we can define multiple mappings per property, to bring the Wikidata data closer to the DBpedia RDF exports, e.g.:


"P625": [

{"rdf:type":"http://www.w3.org/2003/01/geo/wgs84_pos#SpatialThing"},

{"geo:lat":"$getLatitude"},

{"geo:long": "$getLongitude"},

{"georss:point":"$getGeoRss"}],

"P18": [

{"thumbnail":"http://commons.wikimedia.org/wiki/Special:FilePath/$1?width=300"},

{"foaf:depiction":"http://commons.wikimedia.org/wiki/Special:FilePath/$1"}],


*Qualifiers & reification*

Like Wikidata, we provide a simplified dump without qualifiers and a reified dump with qualifiers. However, for reification we chose simple RDF reification, in order to reuse the DBpedia ontology as much as possible. The reified dumps are also mapped using the same configuration.
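For illustration, simple RDF reification turns one statement into four triples on a statement node, to which qualifier triples can then be attached. A minimal sketch (the statement-node naming scheme and the abbreviated URIs are assumptions, not the exact dump vocabulary):

```python
def reify(stmt_uri, s, p, o, qualifiers):
    """Return the triples for one reified statement (simple RDF reification).

    `qualifiers` maps qualifier predicates to values; qualifier triples
    attach directly to the statement node.
    """
    triples = [
        (stmt_uri, "rdf:type", "rdf:Statement"),
        (stmt_uri, "rdf:subject", s),
        (stmt_uri, "rdf:predicate", p),
        (stmt_uri, "rdf:object", o),
    ]
    triples += [(stmt_uri, q, v) for q, v in qualifiers.items()]
    return triples

# Hypothetical example: a leader statement qualified with a start date
out = reify("wd:Q64_P6_stmt", "wd:Q64", "dbo:leader", "wd:Q1234",
            {"dbo:startDate": '"2001-06-16"'})
```

One qualified statement thus becomes five triples here: four for the reification quad plus one per qualifier.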


*Labels, descriptions, aliases and interwiki links*

We additionally defined extractors for data other than statements. For textual data we split the dumps into the languages enabled in the mappings wiki and all the rest. We extract aliases, labels, descriptions and site links. For interwiki links we provide links between Wikidata and DBpedia, as well as links between the different DBpedia language editions.
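The language split for textual data can be sketched as follows; the set of mappings-wiki languages and the item/label values are illustrative placeholders.

```python
# Assumption: an example subset of mappings-wiki-enabled languages
MAPPINGS_LANGS = {"en", "nl", "de", "fr"}

def split_by_language(item, labels):
    """labels: dict lang -> text; return (mapped, rest) label-triple lists."""
    mapped, rest = [], []
    for lang, text in labels.items():
        triple = (item, "rdfs:label", f'"{text}"@{lang}')
        # Languages enabled in the mappings wiki go in one dump, all
        # remaining languages in the other
        (mapped if lang in MAPPINGS_LANGS else rest).append(triple)
    return mapped, rest

mapped, rest = split_by_language("wd:Q64", {"en": "Berlin", "fy": "Berlyn"})
```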

*Properties*

We also fully extract Wikidata property pages. However, for now we do not apply any mappings to Wikidata properties.


*DBpedia extractors*

Some existing DBpedia extractors that provide versioning and provenance metadata (e.g. page ID, revision ID) also apply to Wikidata.


*Help & Feedback*

Although this is work in progress, we wanted to announce it early and get your feedback on the following:


It would be great if you could help us map more data. The easiest way is through the mappings wiki, where you can define equivalent classes & properties. You can see what is missing here: http://mappings.dbpedia.org/server/ontology/wikidata/missing/

You can also contribute JSON configuration, but until the code is merged it will not be easy to do so via pull requests.


Until the code is merged into the main DBpedia repo, you can check it out here:

https://github.com/alismayilov/extraction-framework/tree/wikidataAllCommits


Notes:


Best,

Ali Ismayilov, Dimitris Kontokostas, Sören Auer


[1] https://github.com/alismayilov/extraction-framework/blob/wikidataAllCommits/dump/config.json



--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig
Research Group: http://aksw.org
Homepage: http://aksw.org/DimitrisKontokostas