Hi,
I am doing some 'kicking the tyres' tests on Wikidata as Linked Data. I like the SPARQL end-point, which is more helpful than most, and successfully managed a query for "people with the surname Light" last night. (Only five of them in the world, apparently, but that's another matter. :-) )
What I do have an issue with is the content negotiation. I kept failing to get an RDF rendition of my results, and as a last resort I read the documentation [1].
This described a postfix pattern which delivers RDF/XML (e.g. [2]). However, this pattern is itself subject to content negotiation, and an initial 303 response redirects the URL to e.g. [3]. I am interested in knowing what pattern of URL will deliver RDF/XML /without/ requiring content negotiation, and the answer to that question is not [2] but [3]. This matters, for example, in scenarios where one wants to use XSLT's document() function to retrieve an RDF/XML response directly: the URL pattern [2] will fail. So the documentation is currently unhelpful.
In a similar vein, is there a syntax for running a SPARQL query on Wikidata such that the response is delivered as RDF/XML? In many end-points there is a parameter you can add to specify the response format, which allows you to submit searches as HTTP requests and include the results directly in your (in my case XML-based) processing chain. An HTML results page isn't very machine-processable!
Thanks,
Richard
[1] https://www.wikidata.org/wiki/Wikidata:Data_access
[2] https://www.wikidata.org/entity/Q3807415.rdf
[3] https://www.wikidata.org/wiki/Special:EntityData/Q3807415.rdf
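To make the distinction between patterns [2] and [3] concrete, here is a minimal Python sketch that builds the direct data URL (pattern [3]) for an entity ID, so that a client which does not follow redirects can request the document in one step. The helper names `entity_data_url` and `fetch_rdf_xml` are hypothetical, and the fetch function is an untested illustration: the live service may require extra request headers.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Data URL pattern taken from reference [3] in the message above.
DATA_URL_BASE = "https://www.wikidata.org/wiki/Special:EntityData/"


def entity_data_url(entity_id: str, extension: str = "rdf") -> str:
    """Build the direct data URL for an entity ID such as Q3807415."""
    return f"{DATA_URL_BASE}{quote(entity_id)}.{extension}"


def fetch_rdf_xml(entity_id: str) -> bytes:
    """Fetch the RDF/XML document from the data URL.

    Needs network access and is not called in this sketch; whether the
    server accepts the default urllib request headers is an assumption.
    """
    with urlopen(entity_data_url(entity_id)) as response:
        return response.read()


print(entity_data_url("Q3807415"))
```

The printed URL is the kind of value that could be handed to XSLT's document() function in place of pattern [2].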
Hi Richard,
I can only answer the last part of your question, but you can access the SPARQL endpoint directly at wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query= and I think you can also add a format parameter to specify the format in which the result should be returned.
Best regards,
Bene
On 27.08.2015 at 09:20, Richard Light wrote:
[...]
Wikidata-tech mailing list Wikidata-tech@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-tech
Bene,
Thanks for this. Yes, that works.
For SELECT queries, the response format is application/sparql-results+xml by default (even without specifying the format parameter), and the only other format I could persuade it to offer was JSON (with format=json).
For CONSTRUCT queries (the useful sort :-) ), you need format=rdf (or format=application/rdf+xml), which returns a lovely RDF XML document. If you omit the format parameter, you get a 'File not found' error.
If you put format=json, you do get a JSON response, but (a) the file type isn't specified: the response filename is just the generic 'sparql', and (b) the JSON that is returned looks a bit reified to me. Others can comment on whether this is what they would expect from serializing the results of a CONSTRUCT query in JSON. [1]
Richard
[1] http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query=prefix%20wdt...
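As a rough illustration of the request pattern described above, the following Python sketch builds a query URL with the `query` and `format` parameters. The endpoint host is the beta one quoted in this thread and may well have changed since; the helper name `sparql_url` and the sample CONSTRUCT query are assumptions made for the example, not part of the service.

```python
from typing import Optional
from urllib.parse import urlencode

# Endpoint as quoted in this thread (a 2015 beta host; it may no longer exist).
ENDPOINT = "http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql"


def sparql_url(query: str, result_format: Optional[str] = None) -> str:
    """Return a GET URL for the query, adding format=... only when given."""
    params = {"query": query}
    if result_format is not None:
        params["format"] = result_format
    return f"{ENDPOINT}?{urlencode(params)}"


# Illustrative query only; any CONSTRUCT query could be used here.
construct = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 10"

# format=rdf is the value reported above to yield RDF/XML for CONSTRUCT.
print(sparql_url(construct, "rdf"))
```

A URL built this way can be passed straight to an XML processing chain, which is the use case raised in the original question.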
On 27/08/2015 08:24, Bene* wrote:
[...]
Hi!
> For SELECT queries, the response format is application/sparql-results+xml by default (even without specifying the format parameter), and the only other format I could persuade it to offer was JSON (with format=json).
Yes, the query endpoint supports only XML and JSON formats for now.
> For CONSTRUCT queries (the useful sort :-) ), you need format=rdf (or format=application/rdf+xml), which returns a lovely RDF XML document. If you omit the format parameter, you get a 'File not found' error.
For these, you can do both XML and JSON too. What you could do is run a CONSTRUCT query, which would return JSON with three fields (subject, predicate and object), and then write a simple script in any language that reads that JSON and processes it to output, say, N-Triples-formatted RDF. That would not be hard.
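A minimal sketch of such a script in Python follows. It assumes the JSON uses the standard SPARQL JSON results layout (`results.bindings`, with each term carrying `type`, `value` and optionally `xml:lang` or `datatype`) and that the three fields are named exactly `subject`, `predicate` and `object`; the actual output of the endpoint has not been checked here, and the sample document is made up for illustration.

```python
import json


def escape_literal(text: str) -> str:
    """Escape the characters that N-Triples requires to be escaped in literals."""
    return (text.replace("\\", "\\\\").replace('"', '\\"')
                .replace("\n", "\\n").replace("\r", "\\r"))


def term_to_nt(term: dict) -> str:
    """Serialize one JSON term (uri, bnode or literal) as an N-Triples term."""
    kind, value = term["type"], term["value"]
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    literal = f'"{escape_literal(value)}"'
    if "xml:lang" in term:
        return f'{literal}@{term["xml:lang"]}'
    if "datatype" in term:
        return f'{literal}^^<{term["datatype"]}>'
    return literal


def bindings_to_ntriples(document: dict) -> str:
    """Turn each subject/predicate/object row into one N-Triples line."""
    lines = []
    for row in document["results"]["bindings"]:
        s, p, o = (term_to_nt(row[key]) for key in ("subject", "predicate", "object"))
        lines.append(f"{s} {p} {o} .")
    return "\n".join(lines)


# Hand-written sample in the assumed layout, not real endpoint output.
sample = json.loads("""
{"results": {"bindings": [
  {"subject":   {"type": "uri", "value": "http://www.wikidata.org/entity/Q42"},
   "predicate": {"type": "uri", "value": "http://www.w3.org/2000/01/rdf-schema#label"},
   "object":    {"type": "literal", "xml:lang": "en", "value": "Douglas Adams"}}
]}}
""")

print(bindings_to_ntriples(sample))
```

If the endpoint names its fields differently, only the tuple of keys in `bindings_to_ntriples` would need to change.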
As this is just the start of the SPARQL story, it would be nice to see suggestions about how we could format the output better... Maybe some export options in the GUI. JSON has all the data, but some processing is required to get CONSTRUCT to produce a useful RDF serialization. At least for now.
BTW it's open source (the GUI too), so pull requests to https://github.com/wikimedia/wikidata-query-rdf/tree/master/gui may also be the way to do it ;)
On 27/08/2015 23:37, Stas Malyshev wrote:
> As this is just starting the SPARQL story, it would be nice to see suggestions about how we could format the output better... Maybe some export options in the GUI. JSON has all the data, but some processing is required to get CONSTRUCT produce some useful RDF serialization. At least for now.
I think help with query building is at least as important as serialization of results. If you can't work out how to find anything, there will be no results to serialize. :-) Every SPARQL end-point exposes differently-structured RDF, so the first job a newcomer has is to try to work out what classes and properties are in there, and how they relate to each other.
So it would be good to have a guided query builder, which starts off "I want to find ..." with a drop-down list of classes (possibly complete, possibly a selection of the 'key' ones). You select a class, and a second line pops up with a list of properties for that class. Select one, and you get a text box with autocomplete to type a value into, etc. So the query builder is itself using SPARQL queries to provide context-relevant options for the searcher. When they have the result they want, the system can give them the SPARQL query which generated their result, for future reference and/or hand-editing. But they don't actually /have/ to write any SPARQL themselves.
Richard
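The guided builder described above could generate its option lists with queries along the following lines. This is only a sketch in Python under generic RDF assumptions: it treats classes as values of rdf:type (`a`), which may not match how Wikidata models "instance of", and the function names and example.org IRIs are hypothetical.

```python
def classes_query(limit: int = 50) -> str:
    """Step 1: list candidate classes for the first drop-down."""
    return f"SELECT DISTINCT ?class WHERE {{ ?item a ?class }} LIMIT {limit}"


def properties_query(class_iri: str, limit: int = 50) -> str:
    """Step 2: list properties actually used on instances of the chosen class."""
    return (f"SELECT DISTINCT ?property WHERE {{ ?item a <{class_iri}> ; "
            f"?property ?value }} LIMIT {limit}")


def final_query(class_iri: str, property_iri: str, value: str, limit: int = 100) -> str:
    """Step 3: the query shown to the user for reference or hand-editing."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return (f'SELECT ?item WHERE {{ ?item a <{class_iri}> ; '
            f'<{property_iri}> "{escaped}" }} LIMIT {limit}')


# Hypothetical IRIs standing in for whatever the user picked in steps 1 and 2.
print(classes_query())
print(properties_query("http://example.org/Person"))
print(final_query("http://example.org/Person", "http://example.org/surname", "Light"))
```

Each step only needs the choice made in the previous one, so the user never has to see SPARQL until the final query is displayed.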
Hey,
maybe fixing this task would help new users to create their own SPARQL queries: https://phabricator.wikimedia.org/T101693
Best regards,
Bene
On 28.08.2015 at 10:45, Richard Light wrote:
[...]
Hi Richard!
You are right, the Data access documentation did not properly distinguish between concept URI and data URL. I have fixed this now. URLs of the form https://www.wikidata.org/entity/Q3807415.rdf are not canonical at all - we support them for convenience, but they should not be used as identifiers, and should not show up in our output anywhere.
You can see the changes I made to the documentation here: https://www.wikidata.org/w/index.php?title=Wikidata%3AData_access&type=r...
Sorry about the confusion.
-- daniel
On 27.08.2015 at 09:20, Richard Light wrote:
[...]
Hi Daniel,
Thanks for updating the page. I think, however, you have been a bit over-zealous in changing the URLs. The sentence:
For example, the /concept URI/ of Douglas Adams is http://www.wikidata.org/wiki/Special:EntityData/Q42.
should, I think, use the concept namespace http://www.wikidata.org/entity/.
No problems about any confusion: I see it as a useful learning exercise. For example, I was also going to comment on your double-redirect strategy, but having checked it in Vapour I then re-read the Cool URIs document and realised that it is an impeccable implementation of one of the strategies described there. So now I know more than I did yesterday. :-)
Best wishes,
Richard
On 27/08/2015 11:37, Daniel Kinzler wrote:
[...]
On Thu, Aug 27, 2015 at 12:49 PM, Richard Light richard@light.demon.co.uk wrote:
[...]
Daniel actually had it right in the wikitext source, but somehow the translation garbled it. I have fixed it now.
Cheers,
Lydia