Thanks Stas and Markus. I'm interested in computing various statistics about Wikidata. For example, I want to compute the degree of interlinking between Wikidata and external databases, per entity type, per database, etc. So I need a way to know which properties have an external identifier as their range, along with the name of the external database they point to. For example, P345 is an external identifier pointing to IMDb; P2639 is an external identifier pointing to Filmportal; etc.
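Based on Stas's pointers below, a query sketch along these lines might be what I need, assuming the query service exposes property datatypes via wikibase:propertyType and uses wikibase:ExternalId for external-identifier properties (standard WDQS prefixes assumed to be predefined):

  # Sketch: list external-identifier properties with their English labels
  # (usually the name of the target database) and, where present, their
  # formatter URLs (P1630). wikibase:ExternalId is my assumption about how
  # this datatype is exposed in the WDQS data model.
  SELECT ?property ?propertyLabel ?formatterURL WHERE {
    ?property wikibase:propertyType wikibase:ExternalId .
    OPTIONAL { ?property wdt:P1630 ?formatterURL . }
    SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
  }

The property labels would give me the database names I'm after.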
Hence my question about machine-readable Wikidata schemas and data. Parsing the data is a no-brainer since it is available as JSON and RDF. I already use the JSON dump, since the RDF dump is marked as beta. However, I couldn't find a machine-readable version of the Wikidata schemas, with a formal description of the classes, the properties, and how they relate to each other. I'd like to avoid scraping and/or hard-coding things myself.

Cheers,
Nicolas.

--
Nicolas Torzec
Yahoo Labs
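PS: To be fair, the JSON dump does expose each property's datatype, since properties are entities themselves with a top-level "datatype" field; it's the higher-level schema that I'm missing. A truncated sketch of what the P345 record looks like (from memory, and assuming P345 has already been migrated to the external-id datatype, so treat it as illustrative):

  {
    "id": "P345",
    "type": "property",
    "datatype": "external-id",
    "labels": { "en": { "language": "en", "value": "IMDb ID" } }
  }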
On Thursday, June 23, 2016 11:51 AM, Stas Malyshev <smalyshev@wikimedia.org> wrote:
Hi!
With the small number of properties, it should also be easy to get much of their data with a SPARQL query (depending on what you need). Does BlazeGraph support CONSTRUCT?
Yes. For example, this one: http://preview.tinyurl.com/hk5sudz
should produce a list of property definitions for the WikibaseItem type. These definitions are already part of the dump, but the query works as an illustration.
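In case the shortened link rots, the query is roughly of this shape (a from-memory sketch, not necessarily the exact saved query):

  # Sketch: CONSTRUCT definition triples for all properties whose value
  # type is WikibaseItem; roughly what the shortened link above encodes.
  CONSTRUCT {
    ?property a wikibase:Property ;
              wikibase:propertyType wikibase:WikibaseItem .
  }
  WHERE {
    ?property a wikibase:Property ;
              wikibase:propertyType wikibase:WikibaseItem .
  }

Sending it to the SPARQL endpoint with an RDF/XML Accept header should return the constructed triples in XML, per the format support below.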
Right now there's no way to get the data in the TTL RDF serialization (maybe in the future), but the XML one works: https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#Supported_...