Hi,
I recently set up a nightly cronjob that adds Wikidata links to the YSO Places SKOS dataset. Each night it runs a rather simple SPARQL CONSTRUCT query [1] (via a script [2]) against the Wikidata SPARQL endpoint, looking for YSO ID properties (P2347), and stores the result as sorted N-Triples in a file [3] that then gets committed to GitHub. These triples are then incorporated into the YSO Places dataset [4] and published on Finto.fi.
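For illustration, a minimal Python sketch of such a job could look like the following. The query text, the YSO URI pattern and the output file name are assumptions made up for this example; the actual query and script are the ones linked as [1] and [2].

# Illustrative sketch of the nightly job; the real query [1] and script [2]
# live in the Finto-data repository, this is only an approximation.
import requests

WDQS = "https://query.wikidata.org/sparql"

# CONSTRUCT skos:closeMatch links from Wikidata items carrying a YSO ID (P2347).
# The YSO URI pattern used below is an assumption for this example.
QUERY = """
PREFIX wdt:  <http://www.wikidata.org/prop/direct/>
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
CONSTRUCT {
  ?yso skos:closeMatch ?item .
}
WHERE {
  ?item wdt:P2347 ?ysoId .
  BIND(IRI(CONCAT("http://www.yso.fi/onto/yso/p", ?ysoId)) AS ?yso)
}
"""

# Ask the endpoint for N-Triples (assuming it honours this Accept header
# for CONSTRUCT results, as the Blazegraph-based WDQS generally does).
resp = requests.get(WDQS, params={"query": QUERY},
                    headers={"Accept": "application/n-triples"})
resp.raise_for_status()

# Sort the triples so that successive nightly runs are diffable line by line.
triples = sorted(line for line in resp.text.splitlines() if line.strip())
with open("yso-wikidata-links.nt", "w", encoding="utf-8") as f:
    f.write("\n".join(triples) + "\n")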
But I've noticed that each time this SPARQL query is executed, a few triples that were in the previous version get dropped and others get reinstated. It seems to me that the Wikidata SPARQL endpoint is randomly (?) dropping some triples from the result set.
As an example, the skos:closeMatch triple that links yso:p109659 ("Laanila, Oulu") to wd:Q11874312 was there yesterday [5] but not in today's version [6]. The last edit in Wikidata was made 3 weeks ago [7] so nothing in the RDF data available through the Wikidata SPARQL endpoint should have changed.
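One way to narrow this down is to ask the endpoint repeatedly whether that specific statement is still visible; if the answers disagree between attempts while the item has not been edited, the problem is on the endpoint side rather than in the data. A rough sketch (assuming the P2347 value for this item is the bare ID "109659"):

# Quick diagnostic, not part of the pipeline: ask the endpoint a few times
# whether the Q11874312 -> YSO p109659 mapping is still there.
import time
import requests

WDQS = "https://query.wikidata.org/sparql"
ASK = """
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
ASK { wd:Q11874312 wdt:P2347 "109659" }
"""

for attempt in range(3):
    r = requests.get(WDQS, params={"query": ASK, "format": "json"})
    r.raise_for_status()
    print(attempt, r.json()["boolean"])
    time.sleep(5)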
Is this a known problem? Am I missing something here? Is there something wrong with the approach of running a CONSTRUCT query against the Wikidata endpoint and expecting to get the same result (of around 4200 triples) each time, unless the underlying data in Wikidata has changed?
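To see exactly which links come and go between runs, a plain set comparison of two committed versions of the sorted N-Triples file is enough. The file names below are placeholders for two such versions.

# Compare two nightly dumps to list dropped and reinstated triples.
old = set(open("yso-wikidata-links.yesterday.nt", encoding="utf-8"))
new = set(open("yso-wikidata-links.today.nt", encoding="utf-8"))

print("dropped since yesterday:")
for t in sorted(old - new):
    print("  -", t.rstrip())

print("added/reinstated today:")
for t in sorted(new - old):
    print("  +", t.rstrip())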
-Osma
[1] https://github.com/NatLibFi/Finto-data/blob/master/vocabularies/yso-paikat/w...
[2] https://github.com/NatLibFi/Finto-data/blob/master/vocabularies/yso-paikat/t...
[3] https://github.com/NatLibFi/Finto-data/blob/master/vocabularies/yso-paikat/w...
[4] https://github.com/NatLibFi/Finto-data/blob/master/vocabularies/yso-paikat/y...
[5] https://github.com/NatLibFi/Finto-data/blob/9ea0e34d0814b1aa7a8ac39597d395a7...
[6] https://github.com/NatLibFi/Finto-data/blob/6af4389c9f85b0b2bd1f201f29407388...
[7] https://www.wikidata.org/w/index.php?title=Q11874312&action=history
Working on a 'fixing fake news' thing. It would be good if specific values were available as URIs.
e.g. US population: https://www.wikidata.org/wiki/Q30#population20160709 (= 323,952,889)
Tim.
Good idea.
The format should probably be consistent with the current links to properties: https://www.wikidata.org/wiki/Q30#P1082, so maybe something like https://www.wikidata.org/wiki/Q30#P1082-20160709 rather than https://www.wikidata.org/wiki/Q30#population20160709.
Cdlt, ~nicolas
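For what it's worth, Wikidata already mints a URI for every individual statement (a wds: node), which a fragment convention like the one above could be mapped onto. Here is a small sketch, just a query against the current RDF model rather than an existing page feature, that lists Q30's population statements together with their statement URIs and point-in-time qualifiers; whether the 2016-07-09 value is still among them depends on the current data.

import requests

WDQS = "https://query.wikidata.org/sparql"

# List Q30's population (P1082) statements with statement URI, value and
# point-in-time qualifier (P585).
QUERY = """
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX p:  <http://www.wikidata.org/prop/>
PREFIX ps: <http://www.wikidata.org/prop/statement/>
PREFIX pq: <http://www.wikidata.org/prop/qualifier/>
SELECT ?statement ?population ?pointInTime WHERE {
  wd:Q30 p:P1082 ?statement .
  ?statement ps:P1082 ?population .
  OPTIONAL { ?statement pq:P585 ?pointInTime }
}
"""

r = requests.get(WDQS, params={"query": QUERY, "format": "json"})
r.raise_for_status()
for row in r.json()["results"]["bindings"]:
    print(row["statement"]["value"],
          row["population"]["value"],
          row.get("pointInTime", {}).get("value", "-"))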
Hi!
> Is this a known problem? Am I missing something here? Is there something wrong with the approach of running a CONSTRUCT query against the Wikidata endpoint and expecting to get the same result (of around 4200 triples) each time, unless the underlying data in Wikidata has changed?
This should be fine; the data should not be different between runs. The discrepancy might be due to some of the servers being out of sync. I'll check it.