Hi all,
For many years, Denny and I have been giving talks about why we need to improve the data management in Wikipedia. To explain and motivate this, we have often asked the simple question: "What are the world's largest cities with a female mayor?" The information to answer this is clearly in Wikipedia, but it would be painfully hard to get the result by reading articles.
I recently had the occasion of actually phrasing this in SPARQL, so that an answer can now, finally, be given. The query to run at
http://milenio.dcc.uchile.cl/sparql
is as follows (with some explaining comments inline):
PREFIX : <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel WHERE {
  ?city :P31c/:P279c* :Q515 .   # find instances of subclasses of city
  ?city :P6s ?statement .       # with a P6 (head of government) statement
  ?statement :P6v ?mayor .      # ... that has the value ?mayor
  ?mayor :P21c :Q6581072 .      # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement :P582q ?x }  # ... but the statement has no P582 (end date) qualifier

  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

  # Optionally, find English labels for city and mayor:
  OPTIONAL {
    ?city rdfs:label ?citylabel .
    FILTER ( LANG(?citylabel) = "en" )
  }
  OPTIONAL {
    ?mayor rdfs:label ?mayorlabel .
    FILTER ( LANG(?mayorlabel) = "en" )
  }
}
ORDER BY DESC(?population) LIMIT 100
To see the results, just paste this into the box at http://milenio.dcc.uchile.cl/sparql and press "Run query".
The query does not filter the most recent population but relies on Virtuoso to pick the biggest value for DESC sorting, and on the world to have (mostly) cities with increasing population numbers over time. This is also the reason why the population is not printed (it would give you more than one match per city then, even with DISTINCT). Picking the current population will become easier once ranks are used more widely to mark it.
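The duplicate-row effect mentioned here can be reproduced outside SPARQL. The following is only a small Python sketch, not the endpoint's actual behaviour: the city name, the mayor name, and both population figures are made up, and `rows` stands in for the solutions before projection.

```python
# Hypothetical solution rows: one city with two population statements.
rows = [
    {"city": "Exampleville", "mayor": "Jane Doe", "population": 900_000},
    {"city": "Exampleville", "mayor": "Jane Doe", "population": 1_000_000},
]


def select_distinct(rows, variables):
    """Project each row onto `variables` and drop duplicates (first-seen order)."""
    seen = []
    for row in rows:
        projected = tuple(row[v] for v in variables)
        if projected not in seen:
            seen.append(projected)
    return seen


without_population = select_distinct(rows, ["city", "mayor"])
with_population = select_distinct(rows, ["city", "mayor", "population"])

print(without_population)  # a single row for the city
print(with_population)     # two rows, because they differ in the population column
```

As long as the population is not selected, DISTINCT collapses the two rows into one; once it is selected, the rows are no longer identical and both survive.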
There might also be some inaccuracies in cases where a past mayor does not have an "end date" set in Wikidata (Madrid has a suspiciously large number of current mayors ...), but a query can only ever be as good as its input data.
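To make the "missing end date" problem concrete, here is a minimal Python sketch of what the FILTER NOT EXISTS step does. The statements are invented for illustration (they are not Madrid's actual data), and `end_date` is a hypothetical field standing in for the P582 qualifier.

```python
# Hypothetical "head of government" statements. An end_date of None means that
# no P582 (end date) qualifier is recorded, which need not mean "still in office".
statements = [
    {"city": "City A", "mayor": "Mayor 1", "end_date": "2003"},
    {"city": "City A", "mayor": "Mayor 2", "end_date": None},  # past mayor, qualifier missing
    {"city": "City A", "mayor": "Mayor 3", "end_date": None},  # actual incumbent
]


def current_heads(statements):
    """Mimic FILTER NOT EXISTS { ?statement :P582q ?x }: keep statements without an end date."""
    return [s for s in statements if s["end_date"] is None]


current = current_heads(statements)
print([s["mayor"] for s in current])
```

The filter cannot tell "Mayor 2" from "Mayor 3": both lack an end date, so both are reported as current.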
I hope this is inspiring to some of you. One could also look for the world's youngest or oldest current mayors with similar queries, for example.
Cheers,
Markus
This is seriously awesome! Thank you!
Wikidata-l mailing list Wikidata-l@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata-l
On 20.04.2015 22:21, Denny Vrandečić wrote:
This is seriously awesome! Thank you!
My pleasure. :-)
And here, as a bonus, the list of countries ordered by the number of their cities with female mayor (includes only countries with at least one such city):
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?country ?label (count(*) as ?count) WHERE {
  ?city :P31c/:P279c* :Q515 .   # find instances of subclasses of city
  ?city :P6s ?statement .       # with a P6 (head of government) statement
  ?statement :P6v ?mayor .      # ... that has the value ?mayor
  ?mayor :P21c :Q6581072 .      # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement :P582q ?x }  # ... but the statement has no P582 (end date) qualifier
  ?city :P17s/:P17v ?country    # Also find the country of the city

  # If available, get the en label of the country:
  OPTIONAL {
    ?country rdfs:label ?label .
    FILTER ( LANG(?label) = "en" )
  }
}
GROUP BY ?country ?label
ORDER BY DESC(?count)
There seems to be a great imbalance here, which could indicate some bias/incompleteness of our data -- or, possibly, of the world.
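As a rough illustration of the grouping step, the following Python sketch counts matches per country. The city and country names are placeholders, and `matches` is a hypothetical stand-in for the solutions of the query body.

```python
from collections import Counter

# Hypothetical (city, country) pairs for cities that matched the female-mayor pattern.
matches = [
    ("City A", "Country X"),
    ("City B", "Country X"),
    ("City C", "Country Y"),
]

# GROUP BY ?country with count(*): one entry per country that occurs at all.
counts = Counter(country for _city, country in matches)

# ORDER BY DESC(?count)
ranking = counts.most_common()
print(ranking)
```

A country without any matching city never appears in `matches`, so it cannot show up with a count of zero. Note also that count(*) counts solutions, so a city that matches through two statements would be counted twice.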
Cheers,
Markus
On 20.04.2015 22:29, Nicola Vitucci wrote:
...
I hope this is inspiring to some of you. One could also look for the world's youngest or oldest current mayors with similar queries, for example.
Markus, this is really cool! Can I reuse it as an example on WikiSPARQL? :-)
Yes, of course.
Markus
Hi,
On Mon, Apr 20, 2015 at 8:29 PM, Nicola Vitucci nicola.vitucci@gmail.com wrote:
Markus, this is really cool! Can I reuse it as an example on WikiSPARQL? :-)
What's the difference between http://milenio.dcc.uchile.cl/sparql and WikiSPARQL?
Just a different codebase/engine? Should they be expected to provide essentially the same results for the same query?
-Jeremy
Hi Jeremy,
WikiSPARQL is an experiment I started some time ago to gain some insights into both the data themselves and the backend architecture. It is meant as a learning platform for myself and the Wikidata community, and all the results I get will be shared. As for the query results, they should be essentially the same (I say "essentially" because the loaded dumps might be out of sync), and today I started working on a way to actually use the same interface with other endpoints.
Nicola
On 23.04.2015 20:09, Jeremy Baron wrote:
What's the difference between http://milenio.dcc.uchile.cl/sparql and WikiSPARQL?
Just a different codebase/engine? Should they be expected to provide essentially the same results for the same query?
Yes, they are two services run by different groups of people using different software in the back, but using (so far) similar RDF dumps internally. Both should be considered experimental for now since details of the underlying RDF will still change.
We will also use query logs from the Chilean endpoint to help extract more test queries, see
https://www.mediawiki.org/wiki/Wikibase/Indexing/SPARQL_Query_Examples
(this is still under construction). These will then be useful as test cases and requirement specifications for the ongoing official WMF SPARQL endpoint development. The queries on the page use a slightly different RDF model that is currently under discussion at WMF. Contributions (ideally with links to results) are welcome.
Cheers,
Markus
Hi!
is as follows (with some explaining comments inline):
This is very nice, thanks! Will use this as a test case for the query engine (btw yes it works on my test machine just fine :).
more than one match per city then, even with DISTINCT). Picking the current population will become easier once ranks are used more widely to mark it.
I think this is solved pretty nicely with preferred ranks and the "truthy statements" concept. So people should start using ranks to separate current data from historical data.
On 20.04.2015 22:51, Stas Malyshev wrote:
I think this is solved pretty nicely with preferred ranks and the "truthy statements" concept. So people should start using ranks to separate current data from historical data.
Exactly. It is almost impossible right now to get the most current population without using ranks (it's just too complex a concept for most query languages).
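For readers unfamiliar with ranks, this is roughly how a "best rank" selection behaves, sketched in Python. It assumes the usual reading of the idea (preferred statements win, normal ones are the fallback, deprecated ones are ignored); the figures and the dictionary layout are hypothetical and not the RDF export format.

```python
# Hypothetical population statements for one city, each with a Wikidata-style rank.
statements = [
    {"value": 1_200_000, "rank": "normal"},
    {"value": 1_400_000, "rank": "preferred"},   # marked as the current figure
    {"value": 1_100_000, "rank": "deprecated"},
]


def best_rank_values(statements):
    """Return the values of preferred-rank statements if any exist,
    otherwise those of normal-rank statements; deprecated ones are never returned."""
    preferred = [s["value"] for s in statements if s["rank"] == "preferred"]
    if preferred:
        return preferred
    return [s["value"] for s in statements if s["rank"] == "normal"]


print(best_rank_values(statements))
```

With a rank set on the current figure, a query no longer has to compare dates or guess by taking the largest value.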
Markus
Something seems to be wrong with the order, though. Munich (pop > 1m in all statements) is listed way after Chemnitz (pop < 300k in all statements). Any idea why?
Oh... maybe quantity values are sorted in alphanumeric order, because they are decimal strings? They should be xsd:decimal...
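The suspicion can be illustrated with a few lines of Python. This only shows what string ordering would do to numbers of these magnitudes; it says nothing about what the endpoint actually did, and the two population literals are invented to match "more than 1m" and "less than 300k".

```python
from decimal import Decimal

# Hypothetical population literals, chosen only to match the magnitudes mentioned.
populations = {"Munich": "1400000", "Chemnitz": "240000"}

# A descending sort on the raw strings compares character by character: "2" > "1".
as_strings = sorted(populations, key=lambda city: populations[city], reverse=True)

# A descending sort on the numeric values.
as_decimals = sorted(populations, key=lambda city: Decimal(populations[city]), reverse=True)

print(as_strings)   # Chemnitz before Munich
print(as_decimals)  # Munich before Chemnitz
```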
On 20.04.2015 23:47, Daniel Kinzler wrote:
Something seems to be wrong with the order, though. Munich (pop > 1m in all statements) is listed way after Chemnitz (pop < 300k in all statements). Any idea why?
Good catch. My query was too simple (using one "random" population instead of the biggest one). Here is a better query, this time even with populations given:
PREFIX : <http://www.wikidata.org/entity/>
SELECT ?city (MAX(?population) AS ?max_population) ?citylabel ?mayorlabel WHERE {
  ?city :P31c/:P279c* :Q515 .   # find instances of subclasses of city
  ?city :P6s ?statement .       # with a P6 (head of government) statement
  ?statement :P6v ?mayor .      # ... that has the value ?mayor
  ?mayor :P21c :Q6581072 .      # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement :P582q ?x }  # ... but the statement has no P582 (end date) qualifier

  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city :P1082s/:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

  # Optionally, find English labels for city and mayor:
  OPTIONAL {
    ?city rdfs:label ?citylabel .
    FILTER ( LANG(?citylabel) = "en" )
  }
  OPTIONAL {
    ?mayor rdfs:label ?mayorlabel .
    FILTER ( LANG(?mayorlabel) = "en" )
  }
}
GROUP BY ?city ?citylabel ?mayorlabel
ORDER BY DESC(?max_population) LIMIT 100
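The effect of the added MAX and GROUP BY can be sketched in Python as follows. The cities and numbers are placeholders, and the function is only an illustration of the aggregation, not of how the endpoint evaluates the query.

```python
# Hypothetical solution rows: several population statements per city.
rows = [
    ("City A", 250_000), ("City A", 240_000),
    ("City B", 1_300_000), ("City B", 1_400_000),
]


def max_population_per_city(rows):
    """GROUP BY city with MAX(population), then ORDER BY DESC(max_population)."""
    best = {}
    for city, population in rows:
        best[city] = max(population, best.get(city, population))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


print(max_population_per_city(rows))
```

Each city now contributes exactly one row, so the population can be shown without producing duplicates, and the sort key no longer depends on which statement happens to be picked.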
Oh... maybe quantity values are sorted in alphanumeric order, because they are decimal strings? They should be xsd:decimal...
They are.
Markus
This is super cool, thanks for sharing! Would you mind if I write it up for the Wikidata Query Service docs?
On 21.04.2015 02:05, James Douglas wrote:
This is super cool, thanks for sharing! Would you mind if I write it up for the Wikidata Query Service docs?
No, of course not. We could certainly use some more documentation. Be aware, however, that the RDF export format is still subject to change, so the query will have to change accordingly in the future.
Markus
On 21.04.2015 00:50, Markus Krötzsch wrote:
On 20.04.2015 23:47, Daniel Kinzler wrote:
Something seems to be wrong with the order, though. Munich (pop > 1m in all statements) is listed way after Chemnitz (pop < 300k in all statements). Any idea why?
Good catch. My query was too simple (using one "random" population instead of the biggest one). Here is a better query, this time even with populations given:
I still wonder how the old result came about, since *all* the population values for Munich are much bigger than *all* the population numbers for Chemnitz. Even when picking a random value, how could the order have been reversed?
On 21.04.2015 11:27, Daniel Kinzler wrote:
I still wonder how the old result came about, since *all* the population values for Munich are much bigger than *all* the population numbers for Chemnitz. Even when picking a random value, how could the order have been reversed?
Good question. I don't know. Maybe there is some issue in Virtuoso here after all. However, the rest of the order looked sensible to me even in the old query. It could also be that our (non-live) data had a temporary glitch that has been fixed on Wikidata in the meantime; one should check the population values we get with SPARQL to be sure.
Cheers,
Markus
nice!
but I can't figure out why Paris (Q90 https://www.wikidata.org/wiki/Q90) and Anne Hidalgo (Q2851133 https://www.wikidata.org/wiki/Q2851133) don't show up in the results given that:
Q90
  P31: Q515
  P6: Q2851133 (with no P582q)
Q2851133
  P21: Q6581072
what could be wrong?
--
Maxime Lathuilière maxlath.eu http://maxlath.eu - @maxlath Inventaire https://inventaire.io - @inventaire_io wiki(pedia|data): Zorglub27 https://www.wikidata.org/wiki/User:Zorglub27
Interesting. It seems that Paris has no population!
Markus
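Why a missing population removes a city from the result: in the query, the population pattern is a required part of the WHERE clause, so a city with no population statement produces no row at all. The Python sketch below mimics that behaviour with plain dictionaries. The function names, "Exampleville", "Mayor X" and the number are hypothetical placeholders; only Paris and Anne Hidalgo come from this thread.

```python
# Hypothetical stand-ins for query bindings, not Wikidata data.
mayors = {"Paris": "Anne Hidalgo", "Exampleville": "Mayor X"}
populations = {"Exampleville": 500_000}  # note: no entry for Paris


def required_population(mayors, populations):
    """Like a plain triple pattern: rows without a population are dropped."""
    return [
        (city, mayor, populations[city])
        for city, mayor in mayors.items()
        if city in populations
    ]


def optional_population(mayors, populations):
    """Like an OPTIONAL block: every row is kept, population may be None."""
    return [(city, mayor, populations.get(city)) for city, mayor in mayors.items()]


required_rows = required_population(mayors, populations)
optional_rows = optional_population(mayors, populations)
print(required_rows)
print(optional_rows)
```

Making the population optional would keep such cities in the result, but they would then have no value to sort by.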
indeed! I tried to import this statement from the French Wikipedia, but I guess this can't be taken into account before the next dump/update(?)
Maxime
Hi, I put a link about this on the frwiki chat: https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:Le_Bistro/21_avril_2015#Liste_d...
Let's see if this can shake the community a little bit :)
You can get the live values from WDQ:
https://tools.wmflabs.org/wikidata-todo/tabernacle.html?wdq=claim%5B31%3A%28...
You'll have to sort them yourself, though...
Thanks, it made me realize the data for my city are not up to date :) I wondered whether I would see https://www.wikidata.org/entity/Q16037012 (the city is not that big, but Rennes, a comparable one, showed up in the results, so ...) and it did not.
There is redundancy in this area: the ''head of government'' of a city is also present as an ''office held'' statement: ''mayor of foo'', if there is an item ''mayor of foo'' (and this seems better than just ''office held: mayor'').
Tom
On 20 April 2015 at 21:18, Markus Krötzsch markus@semantic-mediawiki.org wrote:
(Madrid has a suspiciously large number of current mayors ...)
Not any more ;-)
On 20 April 2015 at 21:18, Markus Krötzsch markus@semantic-mediawiki.org wrote:
I recently had the occasion of actually phrasing this in SPARQL, so that an answer can now, finally, be given. The query to run at
http://milenio.dcc.uchile.cl/sparql
is as follows (with some explaining comments inline):
Please can you write this up as a blog post (or a Wikidata: page), so that we can link to it, tweet about it, etc?
We are also missing Houston. Go USA!
https://en.wikipedia.org/wiki/Annise_Parker
Thanks, Richard (User:Pharos)
On 21.04.2015 15:33, Andy Mabbett wrote:
Please can you write this up as a blog post (or a Wikidata: page), so that we can link to it, tweet about it, etc?
Sure, that's possible, but probably not this week. In fact, maybe we should fix the remaining issues first and update the data again. But it's surely a good idea to give a little SPARQL introduction using such an example.
Lydia, would this be something for the WMDE Blog?
Markus
I'm going to write up a sort of tutorial with example queries too (including this one), so if needed I might help.
Nicola
There appear to be a number of other major cities missing, though I'm not sure what the cut-off for population is:
Cities where we have articles for the mayor:
Baltimore, Maryland (USA)
Columbus, Georgia (USA)
Diyarbakır (Turkey)
Gary, Indiana (USA)
Knoxville, Tennessee (USA)
Łódź (Poland)
Malmö (Sweden)
Montevideo (Uruguay)
Windhoek (Namibia)
Cities whose current mayors we don't have articles on yet:
Baghdad (Iraq)
Prague (Czech Republic)
Nantes (France)
(The above based on educated guesses from https://en.wikipedia.org/wiki/List_of_first_female_mayors#2010s )
Thanks, Richard (User:Pharos)
Oh, and the city at the bottom of the query list has a population of only 279!
I have a feeling Wikidata is just playing better with Spanish cities for some reason :)
This is an awesome effort, but needs some work.
Thanks, Richard (User:Pharos)
On Apr 21, 2015 10:41, "Pharos" pharosofalexandria@gmail.com wrote:
There appear to be a number of other major cities missing, though I'm not sure what the cut-off for population is:
Cities where we have articles for the mayor:
Baltimore, Maryland (USA)
[...]
You're welcome to fix some of them. :)
https://www.wikidata.org/wiki/special:diff/208946888/212095823
-Jeremy
BTW, the Freebase ingestion later this summer should help fill a few of those holes in population and other statistics. We had US Census, World Bank, and UN Data as our primary data sources for any /statistics/ of a City/Town/Village. Here's Houston - https://www.freebase.com/m/03l2n#/location/statistical_region and Paris - https://www.freebase.com/m/05qtj#/location/statistical_region
The cut-off in the USA is based on Census data collection years.
Thad +ThadGuidry https://www.google.com/+ThadGuidry
Hi,
On Tue, Apr 21, 2015 at 5:05 PM, Thad Guidry thadguidry@gmail.com wrote:
We had US Census, World Bank, and UN Data as our primary data sources for any /statistics/ of a City/Town/Village. Here's Houston - https://www.freebase.com/m/03l2n#/location/statistical_region
I don't understand where a lot of those numbers are from.
Also, maybe Houston is a bad example because the Census Bureau revised numbers after the data was released.[0] Even some official Census Bureau sites still report the old, pre-appeal number.[1]
There are multiple years that have duplicate conflicting values after clicking "65 values total »" at your link. At first I was thinking it may be something like estimates base vs. estimate vs. decennial. However, for 2010 and 2011 there's one value that matches estimate from [1] (source = [2]) and a larger value (source = [3]) that does not match any other data I've seen. [2] and [3] both use the same "Attribution URI" [4].
In any case, why take this from freebase instead of importing directly from Census Bureau data? It's available in bulk. Format isn't great but isn't horrible either. (at least the 5-year ACS is inconsistent about upper/lower case for state two letter abbreviations. and, I think, most humans would prefer something like a geoid as a key rather than a dataset specific key used to look up the geoid in a different file. and other quirks)
-Jeremy
[0] http://www.chron.com/news/houston-texas/houston/article/City-wins-census-app...
[1] http://factfinder.census.gov/bkmk/table/1.0/en/PEP/2013/PEPANNRES/1620000US4...
[2] https://www.freebase.com/g/11x1k306j
[3] https://www.freebase.com/m/0jst35z
[4] http://www.census.gov/popest/about/terms.html
Here's the (nearly) equivalent query for the statements dump[1] loaded into Blazegraph:
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel WHERE {
  ?city wd:P31s/wd:P31v wd:Q515 .               # find direct instances of city (no subclass traversal here)
  ?city wd:P6s ?statement .                     # with a P6 (head of government) statement
  ?statement wd:P6v ?mayor .                    # ... that has the value ?mayor
  ?mayor wd:P21s/wd:P21v wd:Q6581072 .          # ... where the ?mayor has P21 (sex or gender) female
  FILTER NOT EXISTS { ?statement wd:P582q ?x }  # ... but the statement has no P582 (end date) qualifier

  # Now select the population value of the ?city
  # (the number is reached through a chain of three properties)
  ?city wd:P1082s/wd:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .

  # Optionally, find English labels for city and mayor:
  OPTIONAL {
    ?city wd:P373s/wd:P373v ?citylabel .
    # FILTER ( LANG(?citylabel) = "en" )
  }
  OPTIONAL {
    ?mayor wd:P373s/wd:P373v ?mayorlabel .
    # FILTER ( LANG(?mayorlabel) = "en" )
  }
}
ORDER BY DESC(?population) LIMIT 100
Free beer to anyone who can figure out how to use those language filters. Would we need to also load property definitions[2]?
1. http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150223/wikidata-stat...
2. http://tools.wmflabs.org/wikidata-exports/rdf/exports/20150223/wikidata-prop...
On Tue, Apr 21, 2015 at 10:05 PM, James Douglas jdouglas@wikimedia.org wrote:
Here's the (nearly) equivalent query for the statements dump[1] loaded into Blazegraph:
better to work based on Markus's revised version. (on this thread, Date: Tue, 21 Apr 2015 00:50:26 +0200)
-Jeremy
On 22/04/2015 00:05, James Douglas wrote:
Free beer to anyone who can figure out how to use those language filters. Would we need to also load property definitions[2]?
James,
I believe language filters are defined only on rdfs:label, not on the P373 property(-ies). Replace wd:P373s/wd:P373v with rdfs:label and it will work.
Nicola
P.S. Where is my beer? :-)
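As a rough illustration of the suggestion above, the label part of the query can be generated for any language code. The sketch below is Python that only builds query text; the helper name `label_clause` is hypothetical, and it assumes (as the original query in this thread does) that the `rdfs:` prefix is known to the endpoint or declared elsewhere in the query.

```python
import re


def label_clause(variable, language):
    """Build an OPTIONAL block binding ?<variable>label to an rdfs:label
    in the given language, following the pattern of the original query."""
    if not re.fullmatch(r"[A-Za-z_][A-Za-z0-9_]*", variable):
        raise ValueError(f"unexpected variable name: {variable!r}")
    if not re.fullmatch(r"[A-Za-z]+(-[A-Za-z0-9]+)*", language):
        raise ValueError(f"unexpected language code: {language!r}")
    label = f"?{variable}label"
    return (
        f"OPTIONAL {{ ?{variable} rdfs:label {label} . "
        f'FILTER ( LANG({label}) = "{language}" ) }}'
    )


city_clause = label_clause("city", "en")
mayor_clause = label_clause("mayor", "es")
print(city_clause)
print(mayor_clause)
```

Swapping the language code is then the only change needed to get labels in another language, provided the loaded data contains labels tagged with that language.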
Please try to share SPARQL-related examples as SPARQL Query Results URLs, i.e., URLs that identify documents whose content is generated dynamically by SPARQL query processing. That makes them easier to tweak and diagnose, among other things.
Examples:
1. http://bit.ly/wikidata-query-example-cities-with-female-mayor -- shortened query results url
BTW -- for language tags to work, the content would have to have also been language tagged using "xyz"@en prior to upload to DBMS. Then, modulo use of LANG filter, you would be seeing stuff like "Madrid"@en in the output produced by the SELECT LIST.
Kingsley
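The point about language tags can be illustrated with a small Python model. In SPARQL, LANG() returns the empty string for a literal that carries no language tag, so a filter such as LANG(?x) = "en" rejects untagged values. The `Literal` class and `lang_filter` function below are hypothetical stand-ins for that behaviour, not part of any SPARQL library.

```python
from typing import NamedTuple


class Literal(NamedTuple):
    text: str
    lang: str = ""  # empty string models a literal without a language tag


def lang_filter(literals, wanted):
    """Keep only literals whose language tag equals `wanted`,
    mirroring FILTER ( LANG(?x) = "<wanted>" )."""
    return [lit for lit in literals if lit.lang == wanted]


labels = [
    Literal("Madrid", "en"),  # like "Madrid"@en
    Literal("Madrid", "es"),  # like "Madrid"@es
    Literal("Madrid"),        # plain string, no tag
]

english = lang_filter(labels, "en")
untagged_only = lang_filter([Literal("Madrid")], "en")
print(english)
print(untagged_only)
```

This is why a value reached through a string-valued property, loaded without tags, never passes the language filter.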
On Tue, Apr 21, 2015 at 4:09 PM, Markus Krötzsch markus@semantic-mediawiki.org wrote:
Sure, that's possible, but probably not this week. In fact, maybe we should fix the remaining issues first and update the data again. But it's surely a good idea to give a little SPARQL introduction using such an example.
Lydia, would this be something for the WMDE Blog?
Markus, this is way cool! Good to see we have achieved a milestone ;-) If you want to write something up for the WMDE blog that'd be great.
Cheers Lydia
Hoi, Do I understand correctly that you cannot have results showing labels for a given language? Thanks, GerardM
Hi Gerard,
what do you mean? Are you looking for something like this?
http://wikisparql.org/sparql?query=PREFIX+wd%3A+%3Chttp%3A%2F%2Fwww.wikidata...
(The chosen language is Spanish)
Nicola
Hoi, Yes the link below is for Russian :)
On 23 April 2015 at 19:02, Nicola Vitucci nicola.vitucci@gmail.com wrote:
Hi Gerard,
what do you mean? Are you looking for something like this?
http://wikisparql.org/sparql?query=PREFIX+wd%3A+%3Chttp%3A%2F%2Fwww.wikidata...
(The chosen language is Spanish)
Nicola
On 4/23/15 1:02 PM, Nicola Vitucci wrote:
Hi Gerard,
what do you mean? Are you looking for something like this?
http://wikisparql.org/sparql?query=PREFIX+wd%3A+%3Chttp%3A%2F%2Fwww.wikidata...
(The chosen language is Spanish)
Nicola
Do you not have a URL parameter that resolves to the query source?
Example: http://milenio.dcc.uchile.cl/sparql?default-graph-uri=&qtxt=PREFIX+wd:+%...
If not, please paste the query source in a reply.
Here's what I mean:
Your query, run via the SPARQL endpoint at http://milenio.dcc.uchile.cl/sparql, produces this SPARQL Query Results URL:
http://milenio.dcc.uchile.cl/sparql?default-graph-uri=&query=%0D%0APREFI...
Query Definition URL: http://milenio.dcc.uchile.cl/sparql?default-graph-uri=&qtxt=%0D%0APREFIX...
For the results using English labels (@en tag) you have query result:
http://milenio.dcc.uchile.cl/sparql?default-graph-uri=&query=PREFIX+wd%3...
And query definition:
http://milenio.dcc.uchile.cl/sparql?default-graph-uri=&qtxt=PREFIX+wd%3A...
Key thing here is that anyone is a click away from learning SPARQL by way of example, across various SPARQL endpoints :)
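As a rough illustration of the two kinds of URL mentioned above, the following Python sketch builds both from the same query text. The parameter names `default-graph-uri`, `query` and `qtxt` are taken from the URLs in this thread; the function name `endpoint_url` and the tiny sample query are hypothetical, and other parameters seen in the links (format, timeout and so on) are left out.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

ENDPOINT = "http://milenio.dcc.uchile.cl/sparql"


def endpoint_url(sparql: str, *, definition: bool = False) -> str:
    """Build a results URL (query=...) or a query-definition URL (qtxt=...).

    Hypothetical helper: only the parameter names come from the thread.
    """
    key = "qtxt" if definition else "query"
    params = {"default-graph-uri": "", key: sparql}
    return f"{ENDPOINT}?{urlencode(params)}"


# Placeholder query, just to have something to encode.
sparql = "SELECT ?s WHERE { ?s ?p ?o } LIMIT 1"

results_url = endpoint_url(sparql)                      # runs the query
definition_url = endpoint_url(sparql, definition=True)  # shows the query text

# Decoding the URL again gives back the original query text.
round_trip = parse_qs(urlsplit(definition_url).query)["qtxt"][0]

print(results_url)
print(definition_url)
```

The only difference between the two links is the name of the parameter that carries the query text.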
Aha, I see your point, good that you mentioned that. I'll work on it and, as I said in my reply to Jeremy, I'm working on a way to also let a user choose the endpoint and maybe compare the results across endpoints.
Nicola
Excellent idea!
On 23/04/2015 20:01, Kingsley Idehen wrote:
Do you not have a URL parameter that resolves to the query source?
Example: http://milenio.dcc.uchile.cl/sparql?default-graph-uri=&qtxt=PREFIX+wd:+%...
If not, please paste the query source in a reply.
Hi Kingsley,
currently there is no such parameter, but it's a good idea - I'll add it.
The query source with the Spanish labels is below (replace "es" with "ru" for the Russian labels):
---------------------------------------------------
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel WHERE {
  ?city wd:P31s/wd:P31v wd:Q515 .
  ?city wd:P6s ?statement .
  ?statement wd:P6v ?mayor .
  ?mayor wd:P21s/wd:P21v wd:Q6581072 .
  FILTER NOT EXISTS { ?statement wd:P582q ?x }
  ?city wd:P1082s/wd:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .
  OPTIONAL {
    ?city rdfs:label ?citylabel .
    FILTER ( LANG(?citylabel) = "es" )
  }
  OPTIONAL {
    ?mayor rdfs:label ?mayorlabel .
    FILTER ( LANG(?mayorlabel) = "es" )
  }
} ORDER BY DESC(?population) LIMIT 100
---------------------------------------------------
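Since switching languages only means changing the two language tags, the substitution can be sketched in Python as below. The function name `mayor_query` and the simple language-tag check are hypothetical; the query text is the one given above, and it assumes, as the original does, that the endpoint predefines the `rdfs:` prefix.

```python
import re
from string import Template

# The query from this message, with the language tag as a placeholder.
QUERY_TEMPLATE = Template("""\
PREFIX wd: <http://www.wikidata.org/entity/>
SELECT DISTINCT ?city ?citylabel ?mayorlabel WHERE {
  ?city wd:P31s/wd:P31v wd:Q515 .
  ?city wd:P6s ?statement .
  ?statement wd:P6v ?mayor .
  ?mayor wd:P21s/wd:P21v wd:Q6581072 .
  FILTER NOT EXISTS { ?statement wd:P582q ?x }
  ?city wd:P1082s/wd:P1082v/<http://www.wikidata.org/ontology#numericValue> ?population .
  OPTIONAL { ?city rdfs:label ?citylabel . FILTER ( LANG(?citylabel) = "$lang" ) }
  OPTIONAL { ?mayor rdfs:label ?mayorlabel . FILTER ( LANG(?mayorlabel) = "$lang" ) }
} ORDER BY DESC(?population) LIMIT 100
""")


def mayor_query(lang: str) -> str:
    """Return the query with both label filters set to the given language tag."""
    # Accept only simple lowercase tags such as "es", "ru" or "pt-br",
    # so nothing unexpected ends up inside the query string.
    if not re.fullmatch(r"[a-z]{2,3}(-[a-z0-9]+)*", lang):
        raise ValueError(f"unexpected language tag: {lang!r}")
    return QUERY_TEMPLATE.substitute(lang=lang)


russian_query = mayor_query("ru")
print(russian_query)
```

Calling `mayor_query("es")` gives back the Spanish version shown above.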
Nicola
Yes, I know that. Our responses are a little out of sync. The link I shared to the query definition exposed what I wanted to share with others regarding the use of LANG(), the nature of SELECT list output, etc.
Anyway, it would be great if you added the @qtxt parameter as an alternative to @query when the goal is to identify a document that consists of the SPARQL Query Definition rather than the SPARQL Query Result (or solution).
This is good news!