Hi!
> statements (about 2.5M) and on the question if SPARQL could list all
> entries in Wikidata that do not have statements. I played a bit with
Technically, it could, but since there are so many of them, such queries
might not finish in time. The problem is that there are no indexes on
something not existing, so what probably happens is that the database
goes entity by entity trying to find one that doesn't have a statement,
and that is slow. I think there may be a bug in the LIMIT implementation,
or maybe it's just indeed taking too long...
> combinations of OPTIONAL and FILTER-BOUND and FILTER NOT EXISTS...
> something like:
> PREFIX wikibase: <http://wikiba.se/ontology#>
> SELECT DISTINCT ?entry ?label ?statement WHERE {
>   ?entry rdfs:label ?label . FILTER (lang(?label) = "en")
>   FILTER NOT EXISTS {
>     ?statement ?prop ?entry ;
>                wikibase:rank ?rank .
>   }
> } LIMIT 5
This query also seems a bit wrong since it looks for ?entry as object,
not subject.
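
With ?entry moved to the subject position, a corrected version might look
roughly like the sketch below. It assumes that statement nodes hang directly
off the entity and carry wikibase:rank, as in the query above, and it declares
the rdfs: prefix explicitly in case the endpoint does not predefine it. It
would most likely still be slow, since nothing indexes the absence of
statements.

```sparql
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

# Entries with an English label that have no statement node attached.
SELECT ?entry ?label WHERE {
  ?entry rdfs:label ?label .
  FILTER (lang(?label) = "en")
  FILTER NOT EXISTS {
    # ?entry is now the subject; ?statement is the node that carries a rank.
    ?entry ?prop ?statement .
    ?statement wikibase:rank ?rank .
  }
} LIMIT 5
```

?statement is dropped from the SELECT list because a variable bound only
inside FILTER NOT EXISTS is never visible in the results.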
> But there was something else I noted... statements are
> not typed...
> that would probably kick in some index, rather than the above query,
> and the documentation actually speaks about wikibase:Statement [1] but
> if I search for anything rdf:type-d as such, then it finds nothing in
> the SPARQL endpoint:
Right, please check out:
https://www.mediawiki.org/wiki/Wikibase/Indexing/RDF_Dump_Format#WDQS_data_…
wikibase:Statement is omitted from the database for performance
reasons. You could still match statements by URL by converting them to
a string with str() and then using the substr() function, but that
probably wouldn't help much: there are a lot of statements, so the
filtering would not be very selective.
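
Matching by URL could look something like the sketch below. It assumes
statement URIs start with http://www.wikidata.org/entity/statement/ (please
check the dump format page for the actual prefix), and it uses STRSTARTS()
instead of substr() since that avoids counting characters. The filter still
has to look at every object, so don't expect it to be fast.

```sparql
# Assumed statement URI prefix: http://www.wikidata.org/entity/statement/
SELECT ?entry ?statement WHERE {
  ?entry ?prop ?statement .
  # Keep only IRIs whose string form begins with the statement prefix.
  FILTER (isIRI(?statement) &&
          STRSTARTS(STR(?statement),
                    "http://www.wikidata.org/entity/statement/"))
} LIMIT 5
```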
--
Stas Malyshev
smalyshev(a)wikimedia.org