The Discovery Department at the Wikimedia Foundation is pleased to announce the release of the Wikidata Query Service https://www.mediawiki.org/wiki/Wikidata_query_service! You can find the interface for the service at https://query.wikidata.org.
The Wikidata Query Service is designed to let users run queries on the data contained in Wikidata. The service uses SPARQL https://en.wikipedia.org/wiki/SPARQL as the query language. You can see some example queries in the user manual https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual.
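[To give a flavour of the query language: a minimal SPARQL query against the service might look like the sketch below. The wd:/wdt: prefixes are Wikidata's standard entity and direct-property namespaces; P31 is "instance of" and Q5 is "human". The explicit PREFIX declarations are included for self-containedness.]

```sparql
# A few items that are instances of (P31) human (Q5), with English labels.
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?person ?personLabel WHERE {
  ?person wdt:P31 wd:Q5 .            # instance of human
  ?person rdfs:label ?personLabel .
  FILTER(LANG(?personLabel) = "en")
}
LIMIT 10
```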
Right now, the service is still in beta. This means that our goal https://www.mediawiki.org/wiki/Wikimedia_Engineering/2015-16_Q2_Goals#Wikidata_Query_Service is to monitor the service's usage and collect feedback about what people think should come next. To do that, we've created the Wikidata Query Service dashboard https://searchdata.wmflabs.org/wdqs/ to track usage of the service, and we're in the process https://phabricator.wikimedia.org/T111403 of setting up a feedback mechanism for users of the service. Once we've monitored the usage of the service for a while and gathered user feedback, we'll decide what's next for the development of the service.
If you have any feedback, suggestions, or comments, please do send an email to the Discovery Department's public mailing list, wikimedia-search@lists.wikimedia.org.
Thanks, Dan
Hoi, Wonderful to learn that we have finally progressed towards a live query system. Is it your intention that tools will use this service, and do you hope/anticipate that the tools by Magnus will move towards this new system?
I am particularly looking forward to the tool that builds a query. The examples as provided proved really important for me in starting to use these tools; I really hope for something similar for the new service.
The documentation is only about the test environment. It mentions WikiGrok. Does this release mean that it now runs on the live data? Thanks, GerardM
On 8 September 2015 at 00:29, Dan Garry dgarry@wikimedia.org wrote:
Hi!
I am particularly looking forward to the tool that builds a query. The examples as provided proved really important for me in starting to use these tools; I really hope for something similar for the new service.
That's where community input/contribution is very welcome :)
The documentation is only about the test environment. It mentions WikiGrok. Does this release mean that it now runs on the live data?
https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual is about the production environment.
It runs on live data, and the synchronization time is on https://query.wikidata.org/ (look for "Last updated").
There may be technical issues that make it fall behind from time to time (I know about them and am working to fix them), so synchronization may not be up-to-the-second yet. Ideally that's what we're striving for, and it's what happens most of the time, but not 100% of the time _yet_.
Hi all,
This is great news! How is it kept up to date? Do you generate and load incremental updates?
Nicola
Great, congratulations on getting this deployed! Now we can start developing downstream applications :-)
Markus
P.S. @Gerard: there is no intention to move Magnus's query tools to SPARQL. He could of course do this, if he thinks it would ease future maintenance and feature extensions. But since his tools are working well, there is no reason to change the underlying technology. In the end, anything that works is fine. Having other solutions for specific query features can always be a good idea. SPARQL supports very powerful queries (like "find all statements with references based on 'Le Figaro'"), but more specialised query services that do not support all of this can achieve higher performance. This is why more specialised solutions such as WDQ can also have advantages. Moreover, no matter which query service is used in the backend, it always makes sense to develop more user interfaces that simplify query construction.
On 08.09.2015 00:29, Dan Garry wrote:
FWIW, I'd be happy to turn WDQ into a "redirect" that takes WDQ commands, uses the WDQ-to-SPARQL translator, runs SPARQL, and returns the results in WDQ format. This would be for backwards compatibility only. I hope people will start using SPARQL instead of WDQ where possible, and I will add SPARQL options to my tools where appropriate over time.
As it is, some queries (e.g. ones with large result sets, like "all humans without image") do not work on SPARQL yet. I will try to keep WDQ running until all WDQ queries can run as SPARQL as well.
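[For reference, the "all humans without image" query Magnus mentions would, in SPARQL, be a negated pattern along the lines of the sketch below (P18 is Wikidata's image property). Note that it is the unrestricted scan over every human, not the negation syntax itself, that makes this class of query expensive.]

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?person WHERE {
  ?person wdt:P31 wd:Q5 .                          # every instance of human...
  FILTER NOT EXISTS { ?person wdt:P18 ?image . }   # ...lacking an image (P18)
}
```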
On Tue, Sep 8, 2015 at 9:27 AM Markus Krötzsch < markus@semantic-mediawiki.org> wrote:
Hoi, Thank you Magnus, this is most welcome news. It gives a well-deserved future to the existing tools, and an incentive to optimise the SPARQL results so that they get used more.
It is great to learn that there is/will be an application for the new SPARQL service; it is how "the rest of us" will use this new tool. Thanks, GerardM
On 8 September 2015 at 12:03, Magnus Manske magnusmanske@googlemail.com wrote:
It may not be time to retire WDQ just yet...
Further to what Markus wrote earlier, it does seem that there are still some queries that are a *lot* faster on WDQ than on this initial release of WQS.
For example, as described on Project Chat here, https://www.wikidata.org/wiki/Wikidata:Project_chat#Wikidata_Query_Service, I found that a query to count the number of classes in a particular class tree took 30 times longer to execute on the SPARQL service than with WDQ.
As a result, a lot of queries -- particularly ones which involve recursive extraction, as performed by e.g. the TREE command in WDQ -- look as if they may very easily run into the 30-second time limit for queries on the new SPARQL service. (For example, I wasn't able to count the items that were instances of classes in the above class tree using SPARQL.)
But this isn't to take away from a fabulous achievement by the team getting this fantastic new service up and running -- a service which will presumably only get quicker and quicker as it gets more and more optimised and more and more scaled up. So a very great thanks to all concerned!
All best,
James.
On 08/09/2015 11:03, Magnus Manske wrote:
On 08.09.2015 16:40, James Heald wrote:
Yes, path queries (called TREE queries in WDQ) are usually faster in WDQ. I think WDQ is better optimised for this type of query. This is also what I had in mind with what I wrote: if you narrow down your query language to specific use cases and (possibly) a subset of the data, then you may be able to achieve better performance in return. There is always a trade-off there. SPARQL is rather complex (if you look at the query examples page, you get an idea of the possibilities), but there is a price to pay for this. I still hope that path queries in particular can be made faster in the future (they are a rather recent SPARQL feature, and I am sure the BlazeGraph developers are continuously working on improving their code).
Markus
On 08/09/2015 16:15, Markus Krötzsch wrote:
Thanks, Markus.
Path queries are pretty important for Wikidata, though, because of the way Wikidata is constructed in practice: you almost never want to query for instances (P31) of a class alone -- you will almost always want to include its subclasses too.
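[The subclass-closure idiom James describes maps onto a SPARQL 1.1 property path: a direct P31 step followed by zero-or-more P279 ("subclass of") hops. A sketch, using "film" (Q11424) purely as an arbitrary example class:]

```sparql
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

# Items that are instances of Q11424 or of any (transitive) subclass of it.
SELECT ?item WHERE {
  ?item wdt:P31/wdt:P279* wd:Q11424 .
}
```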
Another query I tried that gave trouble was when somebody asked how to find (or even count) all statements referenced to http://www.lefigaro.fr/... -- see https://www.wikidata.org/wiki/Wikidata:Project_chat#WQS:_Searching_for_items...
It may be that there's a better solution than my newbie attempt at http://tinyurl.com/pxlrkd7 -- but on the face of it, it looks as if WQS is timing out while trying to make a list of all URLs that are the target of a P854 in a reference, and falling over.
But perhaps people with more SPARQL experience than my approximately 24 hours may be able to suggest a better way round this?
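[For context, the question touches the reference layer of Wikidata's RDF model: statement nodes link to reference nodes via prov:wasDerivedFrom, and the reference URL sits on pr:P854. One possibility (a sketch only, not tested against the live service) is to filter on the URL prefix inside the pattern rather than materialising every P854 value first:]

```sparql
PREFIX prov: <http://www.w3.org/ns/prov#>
PREFIX pr: <http://www.wikidata.org/prop/reference/>

SELECT ?statement ?url WHERE {
  ?statement prov:wasDerivedFrom ?ref .
  ?ref pr:P854 ?url .                    # reference URL (P854)
  FILTER(STRSTARTS(STR(?url), "http://www.lefigaro.fr/"))
}
```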
All best,
James.
Chiming in with my excitement that the SPARQL service is 'officially' up!
As suggested somewhere far above, it would be great for the community to catalogue the queries that are most important for their use cases but that do not do well on the SPARQL endpoint. It's likely that the list won't be super-long (in terms of query structure), so it might make sense to establish dedicated, optimized web services (apart from the endpoint) to call upon when those kinds of queries need to be executed.
-Ben
On Tue, Sep 8, 2015 at 8:58 AM, James Heald j.heald@ucl.ac.uk wrote:
On 08.09.2015 18:06, Benjamin Good wrote:
+1 This would be extremely useful to have! In fact, it would also help to make the SPARQL endpoint itself faster, since the developers could use the list to find out what to optimise for. You can always make one type of query faster; you just need users to tell you what is needed most.
Markus
On Tue, Sep 8, 2015 at 6:15 PM, Markus Krötzsch markus@semantic-mediawiki.org wrote:
Agreed. Dan: Will you create a page on-wiki to collect these or should I?
Cheers Lydia
Hi!
As suggested somewhere far above, it would be great for the community to catalogue the queries that are most important for their use cases that do not do well on the SPARQL endpoint. It's likely that the list isn't going to be super long (in terms of query structure), hence it might make sense to establish dedicated, optimized web services (that exist apart from the endpoint) to call upon when those kinds of queries need to be executed.
Good idea. As a preliminary point, I think the same basics as with most other query engines (SQL, etc.) apply:
- non-restrictive queries with tons of results will be slow - e.g. "list of all humans" is probably not a good question to ask :)
- negative searches are usually slower - e.g. "all humans without images" will be slow, since that query would have to inspect records for every human
- unbound paths/traversals will usually be slower (unfortunately, many queries that use TREE in WDQ fall into this category), especially if there are a lot of starting points for the traversal (again, "all humans that...", etc.)
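[The "negative search" case above can be sketched with FILTER NOT EXISTS; the LIMIT keeps the experiment cheap, and the unlimited version should be expected to be slow:]

```sparql
PREFIX wd:  <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?human WHERE {
  ?human wdt:P31 wd:Q5 .                       # instance of: human
  FILTER NOT EXISTS { ?human wdt:P18 ?image }  # no image (P18) statement
}
LIMIT 10
```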
It is also a good idea to put a LIMIT on queries when experimenting: if you intended to write a query that asks for 10 records but accidentally wrote one that returns 10 million, it's much nicer to discover that with a suitable limit than to wait for the query to time out and then try to figure out why it happened.
Yes, I realize all this has to go to some page in the manual eventually :)
Great news! I have been using its predecessor on http://wdqs-beta.wmflabs.org/ for some time now, and can only say that I am quite enthusiastic. Is query.wikidata.org a continuation of the wdqs on wmf labs? I really like how the wdqs is working quite well as an interface to R through its SPARQL plugin or in federated queries submitted from external SPARQL endpoints!
Thanks for the good work.
On Tue, Sep 8, 2015 at 1:21 PM, Andra Waagmeester andra@micelio.be wrote:
Yes, it is the continuation of the beta on labs. Stas: Do you want to turn that into a redirect now?
Cheers Lydia
Lydia,
Before turning it into a redirect, it might be worth looking at the content. There seems to be some discrepancy between the results from http://wdqs-beta.wmflabs.org/ and those from http://query.wikidata.org. When submitted to both endpoints, the following query returns different results:
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT * WHERE { ?diseases wdt:P699 ?doid . }
The results from http://query.wikidata.org contain some erroneous statements [1] (i.e. Disease Ontology IDs without a prefix); the results from the wdqs-beta for the same query seem to be more accurate [2].
[1] http://tinyurl.com/nz5lo9d
[2] http://tinyurl.com/nudtks2
Hi!
Before turning it into a redirect, it might be worth looking at the content. There seems to be some discrepancy between the results from http://wdqs-beta.wmflabs.org/ and those from http://query.wikidata.org. When submitted to both endpoints, the following query returns different results:
These are fed from the same data and are supposed to run the same code, so I wonder how there came to be a difference... The query actually shows there are two values for Q181391 on P699: 987 and DOID:987.
I'll investigate that. It may be some kind of bug. Thanks for bringing it to my attention.
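[For reference, one way to isolate the suspect values Andra noticed would be to filter for P699 values that lack the expected `DOID:` prefix. A sketch, using standard SPARQL 1.1 string functions:]

```sparql
PREFIX wdt: <http://www.wikidata.org/prop/direct/>

SELECT ?disease ?doid WHERE {
  ?disease wdt:P699 ?doid .
  FILTER(!STRSTARTS(STR(?doid), "DOID:"))  # IDs missing the expected prefix
}
```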
Hi!
Yes it is the continuation of the beta on labs. Stas: Do you want to turn that into a redirect now?
Not sure yet what to do with it. I want to keep the labs setup for continued development work, especially when potentially breaking things, but we do want to redirect most people now to the main endpoint, as it is much better at handling the load.
Fair enough. A big blinking banner at the top of the page might do then?
Cheers Lydia
I think a blinking banner would work on the main site. However, you would miss people like me, who use the service mainly from within R or my text editor (TextMate with a Turtle bundle). I wouldn't see a blinking banner.
Is there a write up of how you are using Wikidata from R? That sounds quite cool.
Hi Denny,
The following R script (https://gist.github.com/andrawaag/2b8c831ab4dd70b16cf2) plots Wikidata content on a world map in R.
Andra
Anyone got an idea why this query has trouble when I add the OPTIONAL keyword?
http://tinyurl.com/pgsujp2
Doesn't look much harder than the queries in the examples.
On 08.09.2015 23:56, Denny Vrandečić wrote:
Anyone got an idea why this query has trouble when I add the OPTIONAL keyword?
Doesn't look much harder than the queries in the examples.
Looking at the bottom of the exception output, you can see that the system simply refuses to run the query. The problem seems to be with the labelling service. If you don't select the label for ?head, it works fine:
It seems the implementation of the label service needs some improvement to support unbound variables (which should then return unbound labels, rather than throw a runtime exception ;-).
Markus
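[The failure mode can be reproduced with a minimal query of this shape. Illustrative only: P6 (head of government) stands in for any optional property that can leave ?head unbound:]

```sparql
PREFIX wd:       <http://www.wikidata.org/entity/>
PREFIX wdt:      <http://www.wikidata.org/prop/direct/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd:       <http://www.bigdata.com/rdf#>

SELECT ?country ?head ?headLabel WHERE {
  ?country wdt:P31 wd:Q6256 .           # instance of: country
  OPTIONAL { ?country wdt:P6 ?head . }  # head of government; may be missing
  # At the time of this thread, selecting ?headLabel errored when ?head
  # was unbound; dropping ?headLabel from SELECT made the query work.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 10
```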
P.S. I am not convinced yet of this non-standard extension of SPARQL to fetch labels. Its behaviour based on the variables given in SELECT seems to contradict SPARQL (where SELECT is applied to the results of a query *after* they are computed, without having any direct influence on the meaning of the WHERE part).
A simple UI that would query the Web API on demand might be better, and not put any load on the query service. Considering that the main query service is not the W3C conforming SPARQL endpoint (which gives you raw results), one could as well have some JavaScript there to beautify the resulting item URIs based on client-side requests. Maybe some consumers really need to get labels from SPARQL, but at least the users who want to see results right away would not need this.
Markus
On 09/09/2015 00:13, Markus Krötzsch wrote:
I agree with Markus here. Also, this solution would make the selection of the "overall" language easier. (I was myself experimenting with this kind of approach, but unfortunately my server is down right now and it will take some time to be brought up again to show it.)
What was the rationale behind this choice?
Nicola
Hi!
P.S. I am not convinced yet of this non-standard extension of SPARQL to fetch labels. Its behaviour based on the variables given in SELECT seems
You don't have to use variables in SELECT, it's just a shortcut. You can use it either manually, by specifying the clauses in the body of SERVICE (see https://www.mediawiki.org/wiki/Wikidata_query_service/User_Manual#Label_serv...), or ignore it completely and fetch the labels yourself. The main goal of the service is twofold: 1. allow people to quickly write queries without having to spell out how to get labels (it can be a bit verbose, see e.g. https://www.mediawiki.org/wiki/Wikibase/Indexing/SPARQL_Query_Examples#Presi...); 2. provide language fallback when asking for labels in languages that may not have labels ready.
If you're just fetching data, you can safely ignore the service and the labels altogether.
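[The manual form Stas mentions spells out the label clause inside the SERVICE block instead of relying on the automatic ?xLabel naming convention. A sketch based on the user manual:]

```sparql
PREFIX wd:       <http://www.wikidata.org/entity/>
PREFIX wdt:      <http://www.wikidata.org/prop/direct/>
PREFIX rdfs:     <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd:       <http://www.bigdata.com/rdf#>

SELECT ?cat ?catLabel WHERE {
  ?cat wdt:P31 wd:Q146 .  # instance of: house cat
  SERVICE wikibase:label {
    bd:serviceParam wikibase:language "en" .
    ?cat rdfs:label ?catLabel .  # explicit clause replaces the SELECT shortcut
  }
}
LIMIT 10
```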
A simple UI that would query the Web API on demand might be better, and not put any load on the query service. Considering that the main query
Nothing prevents us from creating such a UI, but for many purposes having to take an extra step to see labels does not seem optimal to me, especially if the data is intended for human consumption.
results), one could as well have some JavaScript there to beautify the resulting item URIs based on client-side requests. Maybe some consumers
I'm not sure what you mean by "beautify". If you mean fetching labels, querying labels separately would slow things down significantly.
really need to get labels from SPARQL, but at least the users who want to see results right away would not need this.
Then why wouldn't these users just ignore the label service altogether?
Thanks! The explanations helped a lot.
I am thrilled to see the Query service coming into production! Wohoo!
Good morning :-)
On 09.09.2015 00:45, Stas Malyshev wrote:
Hi!
P.S. I am not convinced yet of this non-standard extension of SPARQL to fetch labels. Its behaviour based on the variables given in SELECT seems
You don't have to use variables in SELECT, it's just a shortcut.
What I meant with my comment is that the SPARQL semantics does not allow for extensions that modify the query semantics based on the selected variables. Even if this is optional, it changes some fundamental assumptions about how SPARQL works.
...
Nothing prevents us from creating such UI, but for many purposes having to do extra step to see labels does not seem optimal for me, especially if data is intended for human consumption.
I agree that creating such a UI should not be left to WMF or WMDE developers. The SPARQL web API is there for everybody to use. One could also start from general SPARQL tools such as YASGUI (about.yasgui.org) as a basis for a SPARQL editor.
An extra step should not be needed. Users would just use a query page like the one we have now. Only the display of the result table would be modified so that there is a language selector above the table: if a language is selected, all URIs that refer to Wikidata entities get a suitable label as their anchor text. One could also have the option to select "no language", in which case only the item IDs are shown.
results), one could as well have some JavaScript there to beautify the resulting item URIs based on client-side requests. Maybe some consumers
I'm not sure what you mean by "beautify". If you mean to fetch labels, querying labels separately would slow things down significantly.
There should be no slowdown in the SPARQL service, since the labels would be fetched client-side. There would be a short delay between the arrival of the results and the fetching of the labels. We already have similar delays when opening a Wikidata page (site names, for example, take a moment to fetch). Wikidata Query/Autolist also uses this method to fetch labels client-side, and the delay is not too big (it is possible to fetch up to 50 labels in one request).
With beautify I mean that one could do further things, such as having a tooltip when hovering over entities in results that shows more information (or maybe fetches an additional description). That's where people can be creative. I think these kinds of features are best placed in the hands of community members. We can easily have several SPARQL UIs.
really need to get labels from SPARQL, but at least the users who want to see results right away would not need this.
Then why wouldn't these users just ignore the label service altogether?
Because all examples are using it, and many users are learning SPARQL now from the examples. This is the main reason why I care at all. After all, every SPARQL processor has some built-in extensions that deviate from the standard.
Markus
Hi all,
I agree that creating such a UI should not be left to WMF or WMDE developers. The SPARQL web api is there for everybody to use. One could also start from general SPARQL tools such as YASGUI (about.yasgui.org) as a basis for SPARQL editor.
An extra step should not be needed. Users would just use a query page like the one we have now. Only the display of the result table would be modified so that there is a language selector above the table: if a language is selected, all URIs that refer to Wikidata entities will get a suitable label as their anchor text. One could also have a the option to select "no language" where only the item ids are shown.
as I announced last week I've been doing some experiments which are pretty in line with these ideas. You can try this example:
wikisparql.org/sparql?query=DESCRIBE+<http%3A%2F%2Fwww.wikidata.org%2Fentity%2FQ1>
Also, on the endpoint page I've added a little widget which uses a labelling service (not a SPARQL one) to help find entities via text search. I think it might help people get started with SPARQL, along with example queries.
Let me know if it might be of interest, these are just experiments and can be improved in many ways (and I'm not a front-end developer :-)).
Nicola
Hi!
It seems the implementation of the label service needs some improvement to support unbound variables (which should then return unbound labels, rather than throw a runtime exception ;-).
Not sure whether this is possible (need to research it) but if it is, yes, that probably would be the way to fix it.
It seems that it doesn't like looking up the label for ?head when ?head is undefined.
Without ?headLabel it runs fine:
-- James.
Hi!
Anyone an idea why this query has a trouble when I add the OPTIONAL keyword?
Doesn't look much harder than the queries in the examples.
It's not because it's harder. It's because ?head can be unbound, and you cannot apply the label service to unbound variables. If you drop ?headLabel then it works. It is a downside of the label service; I'm not sure yet how to fix it (feel free to submit a Phabricator issue, and maybe I or somebody else will have an idea later).
On 09.09.2015 00:13, Stas Malyshev wrote:
It's not because it's harder. It's because ?head can be unbound, and you can not apply label service to unbound variables. If you drop ?headLabel then it works. It is a downside of the label service, not sure yet how to fix it (feel free to submit the Phabricator issue, maybe myself or somebody else has an idea later).
Why can't the label service just return unbound labels for unbound inputs?
Markus
On 08.09.2015 21:52, Stas Malyshev wrote:
Hi!
Yes it is the continuation of the beta on labs. Stas: Do you want to turn that into a redirect now?
Not sure yet what to do with it. I want to keep the labs setup for continued development work, especially when potentially breaking things, but we do want to redirect most of the people now to main endpoint as it is much better at handling the load.
Suggestion: redirect existing labs URLs and use a new URL on labs for future tests.
Markus