Hi everyone. I have a question about the Wikidata XML dump. I'm posting it here because it seems to be more of a Wikidata question.
In short, it seems that the "pages-articles.xml" dump does not include the datatype property for snaks. For example, the XML dump does not list a datatype for the P41 (flag image) snak on Q38 (Italy). In contrast, the JSON dump lists a datatype of "commonsMedia" for the same snak.
Can this datatype property be included in future XML dumps? The alternative would be to download two large and redundant dumps (XML and JSON) in order to reconstruct a Wikidata instance.
More information is provided below the break. Let me know if you need anything else.
Thanks.
----
Here's an excerpt from the XML data dump for Q38 (Italy) and P41 (flag image). Notice that there is no "datatype" property:

// https://dumps.wikimedia.org/wikidatawiki/20161120/wikidatawiki-20161120-page...
"mainsnak": {
    "snaktype": "value",
    "property": "P41",
    "hash": "a3bd1e026c51f5e0bdf30b2323a7a1fb913c9863",
    "datavalue": {
        "value": "Flag of Italy.svg",
        "type": "string"
    }
},

Meanwhile, the API and the JSON dump both list a datatype property of "commonsMedia":

// https://www.wikidata.org/w/api.php?action=wbgetentities&ids=q38
// https://dumps.wikimedia.org/wikidatawiki/entities/20161114/wikidata-20161114...
"P41": [{
    "mainsnak": {
        "snaktype": "value",
        "property": "P41",
        "datavalue": {
            "value": "Flag of Italy.svg",
            "type": "string"
        },
        "datatype": "commonsMedia"
    },
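
To make the difference concrete, here is a minimal Python sketch that compares the two main snaks quoted above. The dictionaries are copied from the excerpts; the helper names (snak_datatype, keys_only_in) are my own and are not part of any Wikidata tooling.

```python
# Main snak as it appears in the XML dump excerpt above.
xml_dump_snak = {
    "snaktype": "value",
    "property": "P41",
    "hash": "a3bd1e026c51f5e0bdf30b2323a7a1fb913c9863",
    "datavalue": {"value": "Flag of Italy.svg", "type": "string"},
}

# Main snak as it appears in the API / JSON dump excerpt above.
json_dump_snak = {
    "snaktype": "value",
    "property": "P41",
    "datavalue": {"value": "Flag of Italy.svg", "type": "string"},
    "datatype": "commonsMedia",
}


def snak_datatype(snak):
    """Return the snak's datatype, or None when the key is absent."""
    return snak.get("datatype")


def keys_only_in(a, b):
    """Return the sorted keys present in dict a but not in dict b."""
    return sorted(set(a) - set(b))


print(snak_datatype(xml_dump_snak))
print(snak_datatype(json_dump_snak))
print(keys_only_in(json_dump_snak, xml_dump_snak))
```

For these two excerpts, the only key that the JSON dump has and the XML dump lacks is "datatype".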

As far as I can tell, the Turtle (ttl) dump does not list a datatype property either, but this may be because I don't understand its format.

wd:Q38 p:P41 wds:q38-574446A6-FD05-47AE-86E3-AA745993B65D .

wds:q38-574446A6-FD05-47AE-86E3-AA745993B65D a wikibase:Statement,
        wikibase:BestRank ;
    wikibase:rank wikibase:NormalRank ;
    ps:P41 http://commons.wikimedia.org/wiki/Special:FilePath/Flag%20of%20Italy.svg ;
    pq:P580 "1946-06-19T00:00:00Z"^^xsd:dateTime ;
    pqv:P580 wdv:204e90b1bce9f96d6d4ff632a8da0ecc .
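
For reference, the two-dump workaround mentioned at the top would look roughly like the Python sketch below. It assumes that the claims parsed from the XML dump have the same shape as the JSON excerpt (property ID mapped to a list of statements, each with a "mainsnak"), and that a property-to-datatype lookup table has already been built from the JSON dump. The function add_datatypes and the table property_datatypes are hypothetical names used only for illustration, and the sketch handles main snaks only, not qualifiers or references.

```python
import copy


def add_datatypes(claims, property_datatypes):
    """Return a copy of claims with "datatype" filled in on each main snak.

    claims: dict mapping property ID -> list of statements (assumed shape).
    property_datatypes: dict mapping property ID -> datatype string.
    Snaks whose property is not in the lookup table are left unchanged.
    """
    patched = copy.deepcopy(claims)
    for statements in patched.values():
        for statement in statements:
            snak = statement["mainsnak"]
            datatype = property_datatypes.get(snak["property"])
            if datatype is not None and "datatype" not in snak:
                snak["datatype"] = datatype
    return patched


# Hypothetical lookup table, e.g. collected from the JSON dump.
property_datatypes = {"P41": "commonsMedia"}

# Claims as parsed from the XML dump excerpt (no "datatype" key).
xml_claims = {
    "P41": [{
        "mainsnak": {
            "snaktype": "value",
            "property": "P41",
            "hash": "a3bd1e026c51f5e0bdf30b2323a7a1fb913c9863",
            "datavalue": {"value": "Flag of Italy.svg", "type": "string"},
        },
    }],
}

patched_claims = add_datatypes(xml_claims, property_datatypes)
print(patched_claims["P41"][0]["mainsnak"]["datatype"])
```

This works, but it still needs a second source for the lookup table, which is why having the datatype in the XML dump itself would be simpler.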