Hi Dmitriy,
Yes, you are right. There are some countries that have a value for "dissolved or abolished" (P576) but no end date for "instance of" "sovereign state". This includes the "United Kingdom of the Netherlands". Probably these should get an end date for their P31, or we should use P576 in the query.
A list of all states that have no end date to their P31 sovereign state but that do have a P576 date:
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?countryName
WHERE {
  ?country :P576c ?dissolutionDate .
  ?country :P31s ?statement .
  ?statement :P31v :Q3624078 .
  FILTER NOT EXISTS { ?statement :P582q ?endDate }
  ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}
A list of all states that have an end date to their P31 sovereign state but that do not have a P576 date:
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?countryName
WHERE {
  ?country :P31s ?statement .
  ?statement :P31v :Q3624078 .
  ?statement :P582q ?endDate .
  FILTER NOT EXISTS { ?country :P576c ?dissolutionDate }
  ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}
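The second option mentioned above (using P576 in the query itself) could be tried from a script roughly as follows. This is only a sketch in Python: the helper names current_countries_query and run_query are made up for illustration, and it assumes the endpoint accepts standard SPARQL-protocol GET requests, returns JSON results on request, and predefines the rdfs: prefix as the queries in this thread do.

```python
import json
import urllib.parse
import urllib.request

ENTITY_PREFIX = "http://www.wikidata.org/entity/"


def current_countries_query(exclude_dissolved=True):
    """Build the 'currently existing countries' query (hypothetical helper).

    With exclude_dissolved=True, items that have any direct P576
    ("dissolved or abolished") value are dropped as well.
    """
    lines = [
        "PREFIX : <" + ENTITY_PREFIX + ">",
        "SELECT ?country ?countryName WHERE {",
        "  ?country :P31s ?statement .",
        "  ?statement :P31v :Q3624078 .",
        # No end time qualifier on the "sovereign state" statement.
        "  FILTER NOT EXISTS { ?statement :P582q ?endDate }",
    ]
    if exclude_dissolved:
        # Additionally require that no dissolution date is recorded.
        lines.append("  FILTER NOT EXISTS { ?country :P576c ?dissolutionDate }")
    lines += [
        # rdfs: is assumed to be predefined by the endpoint.
        '  ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")',
        "}",
    ]
    return "\n".join(lines)


def run_query(endpoint, query):
    """Send the query as a SPARQL-protocol GET request (hypothetical helper).

    Assumes the endpoint can return application/sparql-results+json.
    """
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    request = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]


if __name__ == "__main__":
    # Only prints the query text; no network access happens here.
    print(current_countries_query())
```

Note that :P576c only covers qualifier-free P576 statements in the current export, so the path (:P576s/:P576v) may be the safer choice if some dissolution dates carry qualifiers.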
Of course, this is all based on the dump we use. Some of these things might have been fixed already.
Cheers,
Markus
On 20.03.2015 20:04, Dmitriy Sintsov wrote:
Hi Markus,

Is

?statement :P31v :Q3624078 .
FILTER NOT EXISTS { ?statement :P582q ?endDate }

really enough to filter out countries that no longer exist? I ask because I have such code in my Python bot: http://paste.debian.net/162319/

Even with so many filters, there is a somewhat strange "Kingdom of Netherlands" that duplicates "Netherlands" but has only a few cities.

Dmitriy
On Fri, Mar 20, 2015 at 9:08 PM, Markus Kroetzsch <markus.kroetzsch@tu-dresden.de> wrote:
Dear all,

Thanks to the people at the Center of Semantic Web Research in Chile [1], we have a very first public SPARQL endpoint for Wikidata running. This is very preliminary, so do not rely on it in applications and expect things to fail, but you may still enjoy some things.

http://milenio.dcc.uchile.cl/sparql

The endpoint has all the data from our current RDF exports in one big database [2]. Below this email are some example queries to get you started (this is a bit of a learning-by-doing crash course in SPARQL too, but you may want to consult a tutorial if you don't know it ;-).

There are some known bugs in the RDF that we will hopefully fix soon [3]. Also, the service uses a dump that is already a few weeks old now. We are more interested in testing functions right now before going production. Also, this is a raw API interface, not a proposal for a nice UI.

Feedback (and other interesting queries) are welcome :-)

Cheers,

Markus

[1] http://ciws.cl/ -- a joint team from University of Chile and Pontificia Universidad Catolica de Chile
[2] http://tools.wmflabs.org/wikidata-exports/rdf/
[3] https://github.com/Wikidata/Wikidata-Toolkit/issues?q=is%3Aopen+is%3Aissue+label%3A%22RDF+export%22

==Lighthouses (Q39715) with their English label (LIMIT 100 for demo)==

PREFIX : <http://www.wikidata.org/entity/>
SELECT *
WHERE {
  ?lighthouse a :Q39715 .
  ?lighthouse rdfs:label ?label FILTER(LANG(?label) = "en")
} LIMIT 100

(Just paste the query into the box at http://milenio.dcc.uchile.cl/sparql)

The actual query condition is in the WHERE {...} part. Things starting with ? are variables. Basic conditions take the form of triples: "subject property value".
For example, "?lighthouse a :Q39715" looks for things that are a lighthouse ("a" is short for "rdf:type" which we use to encode P31 statements without qualifiers). The dot "." is used as a separator between triples.

Note that the label output is a bit cumbersome because you want to filter by language (without the FILTER you get all labels in all languages). A future UI would better fetch the labels after the query, similar to WDQ, to get smaller & faster queries.

==People born in the same place that they died in==

PREFIX : <http://www.wikidata.org/entity/>
SELECT ?person ?personname ?placename
WHERE {
  ?person a :Q5 .
  ?person :P19c ?place .
  ?person :P20c ?place .
  ?person rdfs:label ?personname FILTER(LANG(?personname) = "en") .
  ?place rdfs:label ?placename FILTER(LANG(?placename) = "en")
} LIMIT 100

Here we use a few actual Wikidata properties. Properties in their simple form (Entity->Value) use ids with a "c" in the end, like :P19c here. Only qualifier-free statements will be available in this form right now.

Note that we use the variable ?place in two places as a value. This is how we query for things that have the same place in both cases.

==People who have Wikipedia (Q52) accounts==

PREFIX : <http://www.wikidata.org/entity/>
SELECT ?person ?personname ?username
WHERE {
  ?person :P553s ?statement .
  ?statement :P553v :Q52 .
  ?statement :P554q ?username .
  ?person rdfs:label ?personname FILTER(LANG(?personname) = "en") .
} LIMIT 100

This query needs to access qualifiers of a statement for "website account on" (P553). To do this in RDF (and SPARQL), we access the statement object instead of using simple property :P553c (which would only give us the value). The statement is found through an "...s" property; its value is found through a "...v" property; its qualifiers are found through "...q" properties.
Check out the graph in our paper to get the picture (http://korrekt.org/page/Introducing_Wikidata_to_the_Linked_Data_Web). There you can also find how references are accessed.

==Currently existing countries==

PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?countryName
WHERE {
  ?country :P31s ?statement .
  ?statement :P31v :Q3624078 .
  FILTER NOT EXISTS { ?statement :P582q ?endDate }
  ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}

Similar pattern as with the Wikipedia accounts, but now we check that a certain qualifier (end time) does not exist. You could also find currently married people in this way, etc.

==Descendants of Queen Victoria (Q9439)==

PREFIX : <http://www.wikidata.org/entity/>
SELECT DISTINCT *
WHERE {
  :Q9439 ((^:P25c|^:P22c)+) ?person .
  ?person rdfs:label ?label FILTER(LANG(?label) = "en")
} LIMIT 1000

Here, ((^:P25c|^:P22c)+) is a regular expression; ^ is for changing the direction of a property (has mother -> mother of ...); | is for "or", + is for one or more repetitions.

==Currently existing countries, ordered by the number of their current neighbours==

PREFIX : <http://www.wikidata.org/entity/>
SELECT ?countryName (COUNT (DISTINCT ?neighbour) AS ?neighbours)
WHERE {
  ?country :P31s ?statement .
  ?statement :P31v :Q3624078 .
  FILTER NOT EXISTS { ?statement :P582q ?endDate }
  ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
  OPTIONAL {
    ?country (:P47s/:P47v) ?neighbour .
    ?neighbour :P31s ?statement2 .
    ?statement2 :P31v :Q3624078 .
    FILTER NOT EXISTS { ?statement2 :P582q ?endDate2 }
  }
}
ORDER BY DESC(?neighbours)

Just to give an example of a slightly more complex query ;-) Note how we use the expression (:P47s/:P47v) rather than :P47c to access the value of potentially qualified statements here (since qualified statements are currently not converted to direct :P47c statements).

--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/

Wikidata-l mailing list
Wikidata-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata-l