On Wed, Dec 9, 2020 at 6:22 PM <petr.kadlec@gmail.com> wrote:

On Wed, Dec 9, 2020 at 6:17 PM Baskauf, Steven James <steve.baskauf@vanderbilt.edu> wrote:

When performing a SPARQL query at the WD Query Service (example: https://w.wiki/ptp), these value nodes are identified by an IRI such as wdv:742521f02b14bf1a6cbf7d4bc599eb77 (http://www.wikidata.org/value/742521f02b14bf1a6cbf7d4bc599eb77). The local name part of this IRI seems to be a hash of something. However, when I compare the hash values from the snak JSON returned from the API for the same value node (see https://gist.github.com/baskaufs/8c86bc5ceaae19e31fde88a2880cf0e9 for the example), the hash associated with the value node (35976d7cb070b06a2dec1482aaca2982df3fedd4 in this case) does not have any relationship to the local name part of the IRI for that value node.

 

> Full values are represented as nodes having prefix wdv: and the local name being the hash of the value contents (e.g. wdv:382603eaa501e15688076291fc47ae54). There is no guarantee of the value of the hash except for the fact that different values will be represented by different hashes, and same value mentioned in different places will have the same hash.

 

 

Indeed, no assumptions should be made about this hash value. The initial goal was (I think) for two unrelated claims that have the same complex value elements to share a single value node instead of reifying one for each claim/reference. I would strongly advise against storing this hash for later use.

 

About 35976d7cb070b06a2dec1482aaca2982df3fedd4, which I think you obtained from the wbgetclaims API [0]: I think this hash identifies the Snak, while the one you see from the query service identifies the Value. The former uniquely identifies the Snak, so for another entity using the same value [1] the Snak hash is different (8eb6208639efa82b5e7e4c709b7d18cbfca67411) but the value is identical (+2019-12-14T00:00:00Z).
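To make the Snak/Value distinction concrete, here is a minimal Python sketch that groups snak hashes by the time value they carry. The function name snak_hashes_by_time and the sample list are hypothetical: the sample is hand-built from the two hashes and the time value quoted above and only mimics the shape of a snak object in the API JSON ("snaktype", "hash", "datavalue"). It is not a copy of a real response.

```python
from collections import defaultdict


def snak_hashes_by_time(snaks):
    """Group snak hashes by the time string of their datavalue."""
    grouped = defaultdict(set)
    for snak in snaks:
        # Only "value" snaks carry a datavalue (somevalue/novalue do not).
        if snak.get("snaktype") != "value":
            continue
        datavalue = snak["datavalue"]
        # This sketch only looks at time values.
        if datavalue.get("type") != "time":
            continue
        grouped[datavalue["value"]["time"]].add(snak["hash"])
    return dict(grouped)


# Hand-built sample (assumption): two snaks carrying the same time value,
# using the two snak hashes mentioned in this thread.
sample_snaks = [
    {
        "snaktype": "value",
        "hash": "35976d7cb070b06a2dec1482aaca2982df3fedd4",
        "datavalue": {"type": "time", "value": {"time": "+2019-12-14T00:00:00Z"}},
    },
    {
        "snaktype": "value",
        "hash": "8eb6208639efa82b5e7e4c709b7d18cbfca67411",
        "datavalue": {"type": "time", "value": {"time": "+2019-12-14T00:00:00Z"}},
    },
]

for time_value, hashes in snak_hashes_by_time(sample_snaks).items():
    print(time_value, sorted(hashes))
```

One time value ending up with more than one snak hash is the situation described above: the snak hash cannot stand in for the identity of the value node.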

 

I don't think you can extract the hash of the value using wbgetclaims, but it is visible in the RDF output [2].

 

0: https://www.wikidata.org/w/api.php?action=wbgetclaims&entity=Q42352198&property=P496&formatversion=2

1: https://www.wikidata.org/w/api.php?action=wbgetclaims&entity=Q232113&property=P570&formatversion=2

2: https://www.wikidata.org/wiki/Special:EntityData/Q42352198.ttl?flavor=dump
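As a rough illustration of reading value node identifiers out of the RDF output [2], the Python sketch below pulls wdv: local names from Turtle text with a regular expression. It rests on assumptions: the function name value_node_hashes is made up, the sample Turtle is hand-written rather than copied from the real output, and the pattern assumes local names of 32 hexadecimal characters, as in the examples in this thread (the documentation makes no guarantee about the form of the hash). A real RDF parser would be more robust than a regex.

```python
import re

# Matches a value node written either as a prefixed name (wdv:HASH) or as a
# full IRI (<http://www.wikidata.org/value/HASH>).
# Assumption: HASH is 32 lowercase hex characters, as in the thread's examples.
VALUE_NODE = re.compile(
    r"(?:wdv:|<http://www\.wikidata\.org/value/)([0-9a-f]{32})\b"
)


def value_node_hashes(turtle_text):
    """Return the distinct value node local names found in Turtle text, sorted."""
    return sorted(set(VALUE_NODE.findall(turtle_text)))


# Hand-written sample (assumption), using the value node from this thread.
sample_ttl = """
wdv:742521f02b14bf1a6cbf7d4bc599eb77 a wikibase:TimeValue ;
    wikibase:timeValue "2019-12-14T00:00:00Z"^^xsd:dateTime .
<http://www.wikidata.org/value/742521f02b14bf1a6cbf7d4bc599eb77> a wikibase:TimeValue .
"""

print(value_node_hashes(sample_ttl))
```

Both spellings of the same node collapse to one entry, so the result can be compared directly with the local name that the Query Service returns.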

-----

The difference between the snak and the value makes sense to me now. I was obtaining the hash from the JSON that is returned from the API as a response to a write operation. I wanted to use the hash as a way to record that the value had been written to the API. The problem with using the Query Service to get the value node IRI is the delay in the Query Service updater: the value node IRI would not be immediately available from the Query Service, but the response JSON from the API is available immediately after writing. Is the RDF output [2] also dependent on the Query Service updater, or is it immediately available without the lag?

 

Thanks for taking the time to answer my question.

 

Steve