Content negotiation

Richard Light

27 Aug 2015

9:20 a.m.

Hi,

I am doing some 'kicking the tyres' tests on Wikidata as Linked Data. I like the SPARQL end-point, which is more helpful than most, and successfully managed a query for "people with the surname Light" last night. (Only five of them in the world, apparently, but that's another matter. :-) )

What I do have an issue with is the content negotiation. I kept failing to get an RDF rendition of my results, and as a last resort I read the documentation [1].

This described a postfix pattern which delivers RDF/XML (e.g. [2]). However, this pattern is itself subject to content negotiation, and an initial 303 response redirects the URL to e.g. [3]. I am interested in knowing what pattern of URL will deliver RDF/XML /without/ requiring content negotiation, and the answer to that question is not [2] but [3]. This matters, for example, in scenarios where one wants to use XSLT's document() function to retrieve an RDF/XML response directly. The URL pattern [2] will fail. So the documentation is currently unhelpful.
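To make the difference between the two URL patterns concrete, here is a minimal Python sketch. It only builds strings and performs no HTTP requests. The two prefixes are copied from references [2] and [3]; the function names are hypothetical and not part of any Wikidata tooling.

```python
# Prefixes taken from references [2] and [3] in this message.
CONVENIENCE_PREFIX = "https://www.wikidata.org/entity/"
DATA_PREFIX = "https://www.wikidata.org/wiki/Special:EntityData/"


def entity_data_url(entity_id, fmt="rdf"):
    """Build the data URL (pattern [3]) for an entity ID such as 'Q3807415'."""
    return f"{DATA_PREFIX}{entity_id}.{fmt}"


def convenience_to_data_url(url):
    """Rewrite a URL of pattern [2] into pattern [3], which is the
    target of the 303 redirect described above."""
    if not url.startswith(CONVENIENCE_PREFIX):
        raise ValueError(f"not a convenience URL: {url}")
    return DATA_PREFIX + url[len(CONVENIENCE_PREFIX):]
```

A tool that cannot follow redirects, such as the XSLT document() case mentioned above, could be handed the result of entity_data_url("Q3807415") instead of the shorter form.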

In a similar vein, is there a syntax for running a SPARQL query on Wikidata such that the response is delivered as RDF XML? In many end-points there is a parameter you can add to specify the response format, which allows you to submit searches as HTTP requests and include the results directly in your (in my case XML-based) processing chain. An HTML results page isn't very machine-processible!

Thanks,

Richard

[1] https://www.wikidata.org/wiki/Wikidata:Data_access [2] https://www.wikidata.org/entity/Q3807415.rdf [3] https://www.wikidata.org/wiki/Special:EntityData/Q3807415.rdf

-- *Richard Light*

Bene*

27 Aug

9:24 a.m.

Hi Richard,

I can only answer the last part of your question, but you can access the SPARQL endpoint directly at wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query= and I think you can also add a format parameter to specify the format in which the result should be returned.

Best regards Bene

Wikidata-tech mailing list Wikidata-tech@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-tech

Richard Light

9:57 a.m.

Bene,

Thanks for this. Yes, that works.

For SELECT queries, the response format is application/sparql-results+xml by default (even without specifying the format parameter), and the only other format I could persuade it to offer was JSON (with format=json).

For CONSTRUCT queries (the useful sort :-) ), you need format=rdf (or format=application/rdf+xml), which returns a lovely RDF XML document. If you omit the format parameter, you get a 'File not found' error.
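As a rough illustration of the query and format parameters discussed here, the following Python sketch assembles a request URL without sending it. The endpoint address is the beta one mentioned in this thread and may not be available any more; the helper name sparql_url is hypothetical.

```python
from urllib.parse import urlencode

# Beta endpoint as quoted in this thread (assumption: still reachable).
ENDPOINT = "http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql"


def sparql_url(query, fmt=None, endpoint=ENDPOINT):
    """Return a GET URL for the query; fmt is e.g. 'json' or 'rdf'."""
    params = {"query": query}
    if fmt is not None:
        # Per the observations above, CONSTRUCT needs format=rdf
        # (or application/rdf+xml) to return RDF/XML.
        params["format"] = fmt
    return endpoint + "?" + urlencode(params)


construct = "CONSTRUCT { ?s ?p ?o } WHERE { ?s ?p ?o } LIMIT 1"
url = sparql_url(construct, fmt="rdf")
```

The resulting URL can then be passed to whatever fetches documents in the processing chain.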

If you put format=json, you do get a JSON response, but (a) the file type isn't specified: the response filename is just the generic 'sparql', and (b) the JSON that is returned looks a bit reified to me. Others can comment on whether this is what they would expect from serializing the results of a CONSTRUCT query in JSON. [1]

Richard

[1] http://wdqs-beta.wmflabs.org/bigdata/namespace/wdq/sparql?query=prefix%20wdt...

-- *Richard Light*

Stas Malyshev

28 Aug

12:37 a.m.

Hi!

For SELECT queries, the response format is application/sparql-results+xml by default (even without specifying the format parameter), and the only other format I could persuade it to offer was JSON (with format=json).

Yes, the query supports only xml and json formats now.

For CONSTRUCT queries (the useful sort :-) ), you need format=rdf (or format=application/rdf+xml), which returns a lovely RDF XML document. If you omit the format parameter, you get a 'File not found' error.

For these, you can do both XML and JSON too. What you could do is run a CONSTRUCT query, which would return JSON with three fields (subject, predicate and object), and then write a simple script in any language that reads that JSON and outputs, say, N-Triples-formatted RDF. That would not be hard.
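A script of the kind described here might look like the following Python sketch. It assumes the JSON follows the standard SPARQL JSON results layout (results.bindings, with each term carrying "type" and "value") and that the three fields are named subject, predicate and object as stated above; the actual output of the endpoint was not checked. The function names are hypothetical.

```python
def term_to_nt(term):
    """Serialize one SPARQL JSON term as an N-Triples term."""
    kind, value = term["type"], term["value"]
    if kind == "uri":
        return f"<{value}>"
    if kind == "bnode":
        return f"_:{value}"
    # Anything else is treated as a literal; escape the basics.
    escaped = (value.replace("\\", "\\\\").replace('"', '\\"')
                    .replace("\n", "\\n").replace("\r", "\\r"))
    text = f'"{escaped}"'
    if "xml:lang" in term:
        return f"{text}@{term['xml:lang']}"
    if "datatype" in term:
        return f"{text}^^<{term['datatype']}>"
    return text


def bindings_to_ntriples(doc):
    """Turn a parsed JSON result document into N-Triples text."""
    lines = []
    for row in doc["results"]["bindings"]:
        terms = [term_to_nt(row[k]) for k in ("subject", "predicate", "object")]
        lines.append(" ".join(terms) + " .")
    return "\n".join(lines)
```

The input would come from json.load() on the response body; blank node labels are passed through unchanged, which may need tightening for strict N-Triples parsers.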

As this is just the start of the SPARQL story, it would be nice to see suggestions about how we could format the output better... Maybe some export options in the GUI. JSON has all the data, but some processing is required to get CONSTRUCT to produce a useful RDF serialization. At least for now.

BTW it's open source (the GUI too) so pulls to https://github.com/wikimedia/wikidata-query-rdf/tree/master/gui may also be the way to do it ;)

-- Stas Malyshev smalyshev@wikimedia.org

Richard Light

10:45 a.m.

On 27/08/2015 23:37, Stas Malyshev wrote:

As this is just the start of the SPARQL story, it would be nice to see suggestions about how we could format the output better... Maybe some export options in the GUI. JSON has all the data, but some processing is required to get CONSTRUCT to produce a useful RDF serialization. At least for now.

I think help with query building is at least as important as serialization of results. If you can't work out how to find anything, there will be no results to serialize. :-) Every SPARQL end-point exposes differently-structured RDF, so the first job a newcomer has is to try to work out what classes and properties are in there, and how they relate to each other.

So it would be good to have a guided query builder, which starts off "I want to find ..." with a drop-down list of classes (possibly complete, possibly a selection of the 'key' ones). You select a class, and a second line pops up with a list of properties for that class. Select one, and you get a text box with autocomplete to type a value into, etc. So the query builder is itself using SPARQL queries to provide context-relevant options for the searcher. When they have the result they want, the system can give them the SPARQL query which generated their result, for future reference and/or hand-editing. But they don't actually /have/ to write any SPARQL themselves.
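The last step of such a builder, turning the three selections into a query the user can keep, could be sketched in Python as below. This is purely illustrative: build_query is a hypothetical name, the example.org URIs are placeholders, and matching a class with plain rdf:type ("a") is an assumption that may not fit how a given end-point models its data.

```python
def build_query(class_uri, property_uri, value, limit=10):
    """Assemble a SELECT query from a chosen class, property and value."""
    # Escape characters that would break out of the string literal.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return (
        "SELECT ?item WHERE {\n"
        f"  ?item a <{class_uri}> ;\n"
        f"        <{property_uri}> \"{escaped}\" .\n"
        "}\n"
        f"LIMIT {limit}"
    )


query = build_query(
    "http://example.org/Person",   # placeholder class
    "http://example.org/surname",  # placeholder property
    "Light",
)
```

The generated text is what the builder would show the user for future reference or hand-editing.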

Richard

-- *Richard Light*

Bene*

10:55 a.m.

Hey,

Maybe fixing this task would help new users to create their own SPARQL queries: https://phabricator.wikimedia.org/T101693

Best regards Bene

Daniel Kinzler

27 Aug

12:37 p.m.

Hi Richard!

You are right, the Data access documentation did not properly distinguish between concept URI and data URL. I have fixed this now. URLs of the form https://www.wikidata.org/entity/Q3807415.rdf are not canonical at all - we support them for convenience, but they should not be used as identifiers, and should not show up in our output anywhere.

You can see the changes I made to the documentation here: https://www.wikidata.org/w/index.php?title=Wikidata%3AData_access&type=r...

Sorry about the confusion.

-- daniel

-- Daniel Kinzler Senior Software Developer Wikimedia Deutschland Gesellschaft zur Förderung Freien Wissens e.V.

Richard Light

12:49 p.m.

Hi Daniel,

Thanks for updating the page. I think, however, you have been a bit over-zealous in changing the URLs. The sentence:

For example, the /concept URI/ of Douglas Adams is http://www.wikidata.org/wiki/Special:EntityData/Q42.

should, I think, use the concept namespace http://www.wikidata.org/entity/.

No problems about any confusion: I see it as a useful learning exercise. For example, I was also going to comment on your double-redirect strategy, but having checked it in Vapour I then re-read the Cool URIs document and realised that it is an impeccable implementation of one of the strategies described there. So now I know more than I did yesterday. :-)

Best wishes,

Richard

-- *Richard Light*

Lydia Pintscher

1:06 p.m.

Daniel actually had it right in the wikitext source, but somehow the translation garbled it. I have fixed it now.

Cheers Lydia

-- Lydia Pintscher - http://about.me/lydia.pintscher Product Manager for Wikidata Wikimedia Deutschland e.V. Tempelhofer Ufer 23-24 10963 Berlin www.wikimedia.de Wikimedia Deutschland - Gesellschaft zur Förderung Freien Wissens e. V. Eingetragen im Vereinsregister des Amtsgerichts Berlin-Charlottenburg unter der Nummer 23855 Nz. Als gemeinnützig anerkannt durch das Finanzamt für Körperschaften I Berlin, Steuernummer 27/681/51985.
