Dear all,
I need to know whether it is possible to run queries on a Wikibase instance. My understanding is that, at the moment, queries on the data of a particular instance are only possible with external tools. Is that right?
Thanks in advance for your answers. :)
On 29/04/2015 20:56, Luca Martinelli wrote:
I need to know whether it is possible to run queries on a Wikibase instance. My understanding is that, at the moment, queries on the data of a particular instance are only possible with external tools. Is that right?
Have a look at the Wikidata-Query-Service project: https://phabricator.wikimedia.org/project/profile/891/
On 29.04.2015 20:56, Luca Martinelli wrote:
I need to know whether it is possible to run queries on a Wikibase instance. My understanding is that, at the moment, queries on the data of a particular instance are only possible with external tools. Is that right?
Yes, this is correct. The SPARQL query support that we currently offer is obtained by making an RDF export and loading it into a SPARQL database. (We use Virtuoso, but you could also use BlazeGraph, for example; both have free and open-source versions and are not hard to install. If your data is not very large, you could also try Jena. There are further open-source RDF databases, but these are the most prominent right now, I think.)
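To make this concrete, here is a minimal sketch of that pipeline using Python's rdflib (chosen purely for illustration; the databases above work the same way in principle, but rdflib keeps everything in memory, so it only suits small data). The dump file name is made up:

    # Minimal sketch: load an N-Triples export and run a SPARQL query over it.
    from rdflib import Graph

    g = Graph()
    g.parse("wikibase-dump.nt", format="nt")  # placeholder file name

    # Vocabulary-independent query: the ten most frequent predicates in the data.
    query = """
        SELECT ?p (COUNT(*) AS ?n)
        WHERE { ?s ?p ?o }
        GROUP BY ?p
        ORDER BY DESC(?n)
        LIMIT 10
    """
    for row in g.query(query):
        print(row.p, row.n)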
The RDF export, too, is not currently generated by Wikibase. However, Wikidata Toolkit, which we use to create the RDF dumps, can in theory be used with data from any Wikibase installation. In practice, nobody has asked for this yet, and we might have to make a few adjustments to get it to work in a convenient way. For a start, I don't know what export options a standalone Wikibase offers at the moment. We could use the usual XML-based page dump, provided that it contains valid JSON for a change (this was not the case for Wikidata last time I checked ...). Better yet would be the JSON exports, but I don't know whether you can generate them with vanilla Wikibase or whether WMF is using some special tools for this.
Anyway, it can't be too hard to make this work, and once it is done, you would have a query service at the same level as Wikidata's. You could even combine data from more than one wiki in a single RDF database, e.g., to run queries over data from both.
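A sketch of that combined case, under the same assumptions (the two file names are placeholders): parsing several exports into one graph gives you a single store to query across.

    # Sketch: merge exports from two Wikibase instances into one store.
    from rdflib import Graph

    combined = Graph()
    combined.parse("wiki-a-dump.nt", format="nt")
    combined.parse("wiki-b-dump.nt", format="nt")  # triples accumulate in the same graph

    # Any query now sees the union of both datasets.
    print(len(combined), "triples in the combined graph")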
Regards,
Markus
Hi!
The RDF export, too, is not currently generated by Wikibase. However,
A beta version for Wikidata is already available at https://dumps.wikimedia.org/wikidatawiki/entities/ - however, note the word "beta". :) That means it is not stable and will undergo some big changes over the next month or so, including ontology changes and others. Still, it shows what the export capabilities look like.
Any Wikibase instance can use these capabilities by running the maintenance/dumpRdf.php or maintenance/dumpJson.php scripts, but, again, these features are not 100% stable yet AFAIK, especially the RDF part, which still has some missing pieces (see https://phabricator.wikimedia.org/T50143). dumpJson should be safe to use if you want JSON, though. We are actively working on the RDF part, so it will be ready for use soon, too.
Hello,
Does this also mean that the RDF data available via, for instance, http://www.wikidata.org/entity/Q235382.nt or http://www.wikidata.org/entity/Q235382 cannot be queried via SPARQL unless you download the .nt file and use, for instance, Jena ARQ (https://jena.apache.org/documentation/query/index.html) on your own computer?
Does this also mean that there is then no point in publishing RDF data that links to Wikidata, like, for instance:
@prefix mydata: <http://mydata.fr/> .
@prefix cidoc: <http://www.cidoc-crm.org/cidoc-crm/> .
@prefix wikidata: <http://www.wikidata.org/entity/> .

<http://mydata.fr/event/1> a cidoc:E67_Birth ;
    cidoc:P98_brought_into_life <http://mydata.fr/person/80> ;
    cidoc:P7_took_place_at wikidata:Q235382 .
Thanks,
Jean-Baptiste Pressac
Database processing and analysis / Production and distribution of digital corpora
Centre de Recherche Bretonne et Celtique
Unité mixte de service (UMS) 3554
20 rue Duquesne, CS 93837, 29238 Brest cedex 3
tel: +33 (0)2 98 01 68 95 / fax: +33 (0)2 98 01 63 93
On 29/04/2015 21:44, Markus Krötzsch wrote:
[...]
On 30.04.2015 11:25, Jean-Baptiste Pressac wrote:
Does this also mean that the RDF data available via, for instance, http://www.wikidata.org/entity/Q235382.nt or http://www.wikidata.org/entity/Q235382 cannot be queried via SPARQL unless you download the .nt file and use, for instance, Jena ARQ (https://jena.apache.org/documentation/query/index.html) on your own computer?
Maybe not your own computer, but somebody has to download the data in some way for it to be queried. Query answering is a complicated process that does not usually work on the fly; somebody has to do the math in the end. There are approaches for exploring linked data in a query-like manner in real time, but it should be clear that this will always take much longer than querying data you have downloaded first.
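For instance, a small sketch in Python with rdflib (standing in here for Jena ARQ) that downloads the .nt file from your example and queries it locally:

    # Sketch: fetch the N-Triples description of one entity and query it locally.
    from rdflib import Graph, URIRef

    g = Graph()
    g.parse("http://www.wikidata.org/entity/Q235382.nt", format="nt")  # downloads the file

    # List the statements about the entity itself.
    entity = URIRef("http://www.wikidata.org/entity/Q235382")
    for p, o in g.predicate_objects(subject=entity):
        print(p, o)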
Does this also mean that there is then no point in publishing RDF data that links to Wikidata, like, for instance:
[...]
This is still a useful thing to do, for several reasons. First, the link connects your data and clarifies its meaning, which is useful to consumers who find your data. Second, there are linked data crawlers that aggregate linked data from many sources to provide a query service. OpenLink runs one such service, and if it found your data, you could use that service to issue queries.
Regards,
Markus
Sorry, all the mails were lost in the mare magnum of my full inbox, and I forgot to thank you all for your answers. :)
L.
2015-04-30 12:07 GMT+02:00 Markus Krötzsch markus@semantic-mediawiki.org:
[...]