I'm using the following Neo4j Cypher query that I'd like to replace with a SPARQL query against Wikidata:
MATCH (a:Item), (b:Item)
WHERE a.itemId IN ['Q2', 'Q405', 'Q525'] AND b.itemId IN ['Q2', 'Q405', 'Q525']
WITH a, b
OPTIONAL MATCH (a)-[rel]-(b)
RETURN a, b, collect(rel)
The objective is to return the relationships between a given set of items. Here's the SPARQL query that I'm trying, but it times out:
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX entity: <http://www.wikidata.org/entity/>
PREFIX p: <http://www.wikidata.org/prop/direct/>
SELECT ?subjectUrl ?subjectLabel ?propUrl ?propLabel ?objectUrl ?objectLabel WHERE {
  hint:Query hint:optimizer 'None' .
  ?subjectUrl ?propUrl ?objectUrl .
  ?subjectUrl rdfs:label ?subjectLabel .
  ?objectUrl rdfs:label ?objectLabel .
  FILTER (LANG(?subjectLabel) = 'en') .
  FILTER (LANG(?objectLabel) = 'en') .
  ?property ?ref ?propUrl .
  ?property a wikibase:Property .
  ?property rdfs:label ?propLabel
  FILTER (lang(?propLabel) = 'en' ) .
  FILTER (?subjectUrl IN (entity:Q2, entity:Q405, entity:Q525)) .
  FILTER (?objectUrl IN (entity:Q2, entity:Q405, entity:Q525)) .
}
LIMIT 200
Is it feasible to have a SPARQL query with this objective return in a few seconds? If so, can you please give me some guidance on an approach?
Thanks, James Weaver
Hi!
This one seems to be working:
SELECT * WHERE {
  VALUES ?from { wd:Q2 wd:Q405 wd:Q525 }
  VALUES ?to { wd:Q2 wd:Q405 wd:Q525 }
  ?from ?rel ?to
}
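If the set of items changes from call to call, the same query can be generated from a list of IDs. What follows is only a minimal Python sketch under some assumptions: `build_relations_query` and `run_query` are hypothetical helper names, the endpoint is assumed to be the public one at https://query.wikidata.org/sparql (which predefines the `wd:` prefix), and the User-Agent string is a placeholder you would replace with your own.

```python
import json
import re
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_relations_query(item_ids):
    """Build the VALUES-based query for the given item IDs (e.g. 'Q2')."""
    for qid in item_ids:
        # Reject anything that is not a plain item ID before splicing it in.
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    values = " ".join(f"wd:{qid}" for qid in item_ids)
    return (
        "SELECT ?from ?rel ?to WHERE {\n"
        f"  VALUES ?from {{ {values} }}\n"
        f"  VALUES ?to {{ {values} }}\n"
        "  ?from ?rel ?to .\n"
        "}"
    )


def run_query(query):
    """Send the query to the endpoint and return the JSON result bindings."""
    url = WDQS_ENDPOINT + "?" + urllib.parse.urlencode(
        {"query": query, "format": "json"}
    )
    request = urllib.request.Request(
        url,
        # Placeholder User-Agent; replace with one identifying your client.
        headers={"User-Agent": "relations-example/0.1 (placeholder)"},
    )
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)["results"]["bindings"]


# Print the generated query; call run_query(...) to actually execute it.
print(build_relations_query(["Q2", "Q405", "Q525"]))
```

Because both `?from` and `?to` range over the same set, each pair is tried in both directions, which mirrors the undirected `(a)-[rel]-(b)` match in the Cypher query.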
SELECT ?subjectUrl ?subjectLabel ?propUrl ?propLabel ?objectUrl ?objectLabel WHERE {
  hint:Query hint:optimizer 'None' .
  ?subjectUrl ?propUrl ?objectUrl .
  ?subjectUrl rdfs:label ?subjectLabel .
  ?objectUrl rdfs:label ?objectLabel .
  FILTER (LANG(?subjectLabel) = 'en') .
  FILTER (LANG(?objectLabel) = 'en') .
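I don't think this is the best approach: you have basically told the engine to go through every label in the system (tens of millions of them), and you turned off the optimizer, so it can't even rearrange the clauses. Binding the items up front with VALUES, as in the query above, lets the engine start from the three items instead.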
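If you still want the labels from your original SELECT, one option is to bind the items first and resolve labels only for the rows that match. The sketch below is an assumption on my part rather than something tested in this thread: `build_labeled_query` is a hypothetical helper name, and it relies on the Wikidata Query Service's predefined `wd:`, `wikibase:` and `bd:` prefixes, its `wikibase:label` service, and `wikibase:directClaim` to map a direct predicate back to its property entity.

```python
import re


def build_labeled_query(item_ids, language="en"):
    """Build a query that binds the items first, then fetches labels."""
    for qid in item_ids:
        # Only plain item IDs such as 'Q2' are spliced into the query.
        if not re.fullmatch(r"Q[1-9]\d*", qid):
            raise ValueError(f"not a Wikidata item ID: {qid!r}")
    if not re.fullmatch(r"[a-z]+(-[a-z]+)*", language):
        raise ValueError(f"unexpected language code: {language!r}")
    values = " ".join(f"wd:{qid}" for qid in item_ids)
    # No optimizer hint here, so the engine stays free to reorder clauses.
    return f"""SELECT ?from ?fromLabel ?prop ?propLabel ?to ?toLabel WHERE {{
  VALUES ?from {{ {values} }}
  VALUES ?to {{ {values} }}
  ?from ?rel ?to .
  ?prop wikibase:directClaim ?rel .
  SERVICE wikibase:label {{ bd:serviceParam wikibase:language "{language}" . }}
}}"""


print(build_labeled_query(["Q2", "Q405", "Q525"]))
```

Note that the `wikibase:directClaim` line keeps only relationships expressed as direct property claims, so any other predicate that happens to link two of the items would drop out of the result.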