Really impressive!

On Fri, Mar 20, 2015 at 6:09 PM Markus Kroetzsch <markus.kroetzsch@tu-dresden.de> wrote:
Dear all,

Thanks to the people at the Center of Semantic Web Research in Chile
[1], we have a first public SPARQL endpoint for Wikidata running.
This is very preliminary, so do not rely on it in applications and
expect things to fail, but you may still enjoy trying it out.

http://milenio.dcc.uchile.cl/sparql

The endpoint has all the data from our current RDF exports in one big
database [2]. Below this email are some example queries to get you
started (this is a bit of a learning-by-doing crash course in SPARQL
too, but you may want to consult a tutorial if you don't know it ;-).

There are some known bugs in the RDF that we will hopefully fix soon
[3]. Also, the service uses a dump that is already a few weeks old.
Right now we are more interested in testing functionality before going
into production. Also, this is a raw API, not a proposal for a nice UI.

Feedback and other interesting queries are welcome :-)

Cheers,

Markus


[1] http://ciws.cl/ -- a joint team from University of Chile and
Pontificia Universidad Catolica de Chile
[2] http://tools.wmflabs.org/wikidata-exports/rdf/
[3] https://github.com/Wikidata/Wikidata-Toolkit/issues?q=is%3Aopen+is%3Aissue+label%3A%22RDF+export%22


==Lighthouses (Q39715) with their English label (LIMIT 100 for demo)==

PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE {
  ?lighthouse a :Q39715 .
  ?lighthouse rdfs:label ?label FILTER(LANG(?label) = "en")
} LIMIT 100

(Just paste the query into the box at http://milenio.dcc.uchile.cl/sparql)

The actual query condition is in the WHERE {...} part. Things starting
with ? are variables. Basic conditions take the form of triples:
"subject property value". For example, "?lighthouse a :Q39715" looks for
things that are a lighthouse ("a" is short for "rdf:type" which we use
to encode P31 statements without qualifiers). The dot "." is used as a
separator between triples.
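
Instead of pasting into the box, the same query can be sent from a script. The following is only a minimal sketch: it assumes the endpoint accepts a standard SPARQL 1.1 Protocol GET request with a "query" parameter and can return JSON results, which the email does not state. The names build_request and run_query are made up for illustration, and the rdfs: prefix is declared explicitly in case the endpoint does not predefine it.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://milenio.dcc.uchile.cl/sparql"  # endpoint named in this email

LIGHTHOUSE_QUERY = """
PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT *
WHERE {
  ?lighthouse a :Q39715 .
  ?lighthouse rdfs:label ?label FILTER(LANG(?label) = "en")
} LIMIT 100
"""


def build_request(query, endpoint=ENDPOINT):
    """Build a GET request with the query URL-encoded in the 'query' parameter."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    # Assumption: the endpoint honours this Accept header and returns JSON.
    headers = {"Accept": "application/sparql-results+json"}
    return urllib.request.Request(url, headers=headers)


def run_query(query, endpoint=ENDPOINT, timeout=30):
    """Send the query and parse the JSON response (needs network access)."""
    request = build_request(query, endpoint)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Calling run_query(LIGHTHOUSE_QUERY) would perform the actual request; since the service is preliminary, expect it to fail at times.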

Note that the label output is a bit cumbersome because you have to
filter by language (without the FILTER you get all labels in all
languages). A future UI should rather fetch the labels after the query,
similar to WDQ, to get smaller & faster queries.
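
To illustrate picking labels on the client side rather than inside the query, here is a small sketch that selects one language from rows in the standard SPARQL JSON results format. The helper name pick_label is hypothetical, and the sample rows are hand-written for illustration, not output from the endpoint.

```python
def pick_label(bindings, var="label", lang="en"):
    """Return the first value bound to `var` whose language tag is `lang`, else None."""
    for row in bindings:
        term = row.get(var)
        if term is not None and term.get("xml:lang") == lang:
            return term["value"]
    return None


# Hand-written sample in the shape of results["results"]["bindings"].
sample = [
    {"label": {"type": "literal", "xml:lang": "de", "value": "Leuchtturm"}},
    {"label": {"type": "literal", "xml:lang": "en", "value": "lighthouse"}},
]
print(pick_label(sample))  # prints: lighthouse
```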


==People born in the same place that they died in==

PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personname ?placename
WHERE {
  ?person a :Q5 .
  ?person :P19c ?place .
  ?person :P20c ?place .
  ?person rdfs:label ?personname FILTER(LANG(?personname) = "en") .
  ?place rdfs:label ?placename FILTER(LANG(?placename) = "en")
}  LIMIT 100

Here we use a few actual Wikidata properties. Properties in their simple
form (Entity->Value) use ids with a "c" at the end, like :P19c here.
Only qualifier-free statements are available in this form right now.
Note that we use the variable ?place as a value in two triples. This is
how we query for things that have the same place in both cases.
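
Reusing a variable in two triples amounts to a join. A rough sketch of the idea, with placeholder names instead of real Wikidata ids and a hypothetical helper same_place:

```python
def same_place(born_in, died_in):
    """Return the (person, place) pairs that occur in both relations."""
    return sorted(set(born_in) & set(died_in))


# Placeholder data: each pair stands for one "?person :P19c ?place"
# or "?person :P20c ?place" triple.
born_in = [("person1", "placeA"), ("person2", "placeB")]
died_in = [("person1", "placeA"), ("person2", "placeC")]
print(same_place(born_in, died_in))  # prints: [('person1', 'placeA')]
```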


==People who have Wikipedia (Q52) accounts==

PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?person ?personname ?username
WHERE {
   ?person :P553s ?statement .
   ?statement :P553v :Q52 .
   ?statement :P554q ?username .
   ?person rdfs:label ?personname FILTER(LANG(?personname) = "en") .
} LIMIT 100

This query needs to access qualifiers of a statement for "website
account on" (P553). To do this in RDF (and SPARQL), we access the
statement object instead of using simple property :P553c (which would
only give us the value). The statement is found through an "...s"
property; its value is found through a "...v" property; its qualifiers
are found through "...q" properties. Check out the graph in our paper to
get the picture
(http://korrekt.org/page/Introducing_Wikidata_to_the_Linked_Data_Web).
There you can also find how references are accessed.
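
The "s" / "v" / "q" naming scheme is regular enough to generate the triple patterns mechanically. The sketch below does just that; statement_pattern is a hypothetical helper for illustration, not part of any Wikidata tool.

```python
def statement_pattern(subject, prop, value, qualifier_prop, qualifier_var,
                      statement_var="?statement"):
    """Build the three triple patterns for a statement with one qualifier."""
    return "\n".join([
        f"{subject} :{prop}s {statement_var} .",                 # entity -> statement
        f"{statement_var} :{prop}v {value} .",                   # statement -> value
        f"{statement_var} :{qualifier_prop}q {qualifier_var} .", # statement -> qualifier
    ])


print(statement_pattern("?person", "P553", ":Q52", "P554", "?username"))
```

This prints the same three patterns that the query above uses for P553 and P554.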


==Currently existing countries==

PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?country ?countryName
WHERE {
   ?country :P31s ?statement .
   ?statement :P31v :Q3624078 .
      FILTER NOT EXISTS { ?statement :P582q ?endDate }
   ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")
}

This follows the same pattern as the Wikipedia accounts query, but now we
check that a certain qualifier (end time, P582) does not exist. You could
also find currently married people in this way, etc.
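
In plain code, the FILTER NOT EXISTS check corresponds to dropping every statement that carries an end-time qualifier. A minimal sketch, using a made-up data layout and placeholder statements (none of this comes from the actual export):

```python
END_TIME = "P582"  # the qualifier tested by FILTER NOT EXISTS in the query


def without_end_time(statements):
    """Keep only the statements that have no end-time qualifier."""
    return [s for s in statements if END_TIME not in s["qualifiers"]]


# Placeholder statements; item names and the date are invented.
statements = [
    {"item": "item1", "qualifiers": {}},
    {"item": "item2", "qualifiers": {"P582": "1990-01-01"}},
]
print([s["item"] for s in without_end_time(statements)])  # prints: ['item1']
```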


==Descendants of Queen Victoria (Q9439)==

PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT DISTINCT *
WHERE {
  :Q9439 ((^:P25c|^:P22c)+) ?person .
  ?person rdfs:label ?label
  FILTER(LANG(?label) = "en")
} LIMIT 1000

Here, ((^:P25c|^:P22c)+) is a property path, i.e. a regular expression
over properties; ^ is for changing the direction of a property (has
mother -> mother of ...); | is for "or", + is for one or more repetitions.
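
What the "+" computes can be pictured as a graph search: start at one person and keep following "mother of" or "father of" edges. The sketch below assumes the inverted P25/P22 edges have already been merged into one child map; the function name and the sample ids are placeholders, not real data.

```python
from collections import deque


def descendants(root, children_of):
    """Everyone reachable from `root` in one or more parent -> child steps."""
    seen = set()
    queue = deque(children_of.get(root, ()))
    while queue:
        person = queue.popleft()
        if person in seen:
            continue
        seen.add(person)
        queue.extend(children_of.get(person, ()))
    return seen


# Placeholder family tree: A has children B and C, B has child D.
children_of = {"A": ["B", "C"], "B": ["D"]}
print(sorted(descendants("A", children_of)))  # prints: ['B', 'C', 'D']
```

As with "+", the starting person is not part of the result unless some chain of steps leads back to it.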


==Currently existing countries, ordered by the number of their current neighbours==

PREFIX : <http://www.wikidata.org/entity/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?countryName (COUNT (DISTINCT ?neighbour) AS ?neighbours)
WHERE {
   ?country :P31s ?statement .
   ?statement :P31v :Q3624078 .
      FILTER NOT EXISTS { ?statement :P582q ?endDate }
   ?country rdfs:label ?countryName FILTER(lang(?countryName)="en")

   OPTIONAL { ?country (:P47s/:P47v) ?neighbour .
              ?neighbour :P31s ?statement2 .
              ?statement2 :P31v :Q3624078 .
              FILTER NOT EXISTS { ?statement2 :P582q ?endDate2 }
   }
} GROUP BY ?countryName ORDER BY DESC(?neighbours)

Just to give an example of a slightly more complex query ;-) Note how we
use the expression (:P47s/:P47v) rather than :P47c to access the value
of potentially qualified statements here (since qualified statements are
currently not converted to direct :P47c statements).


--
Markus Kroetzsch
Faculty of Computer Science
Technische Universität Dresden
+49 351 463 38486
http://korrekt.org/
