On Mon, Apr 25, 2016 at 12:16 PM, Herman Bergwerf <hermanbergwerf@gmail.com> wrote:

Hmm, with a contains filter this is perfect actually! I must have been doing something wrong because my query took about 20 seconds (I'm new to SPARQL, I think it was slow because I copied `?compound wdt:P31/wdt:P279* wd:Q11173` from somewhere).
An alternative I came up with is calling https://www.wikidata.org/w/api.php?action=wbsearchentities&search=benzeen&language=$targetLanguage and keeping only the entries that have 'organic compound' in their description, etc., but this is much cleaner.
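
For reference, the API alternative could be sketched roughly like this in Python. This is only an illustration, not something I have run against the live API: the helper names (build_search_url, filter_by_description) and the sample entries are made up, and adding format=json is my assumption about the wanted output. The sample list only mimics the shape of the "search" array in a wbsearchentities response.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(term, language):
    """Build a wbsearchentities URL for `term` in `language` (hypothetical helper)."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "format": "json",  # assumption: we want JSON back
    }
    return API_URL + "?" + urlencode(params)


def filter_by_description(entries, needle):
    """Keep only entries whose description contains `needle` (case-insensitive)."""
    needle = needle.lower()
    # Entries without a description are dropped, since they cannot match.
    return [e for e in entries if needle in e.get("description", "").lower()]


# Made-up entries, shaped like the "search" array of a response.
sample = [
    {"id": "Q_EXAMPLE_1", "label": "benzeen", "description": "organic compound"},
    {"id": "Q_EXAMPLE_2", "label": "Benzeen", "description": "album"},
    {"id": "Q_EXAMPLE_3", "label": "benzeen"},
]

print(build_search_url("benzeen", "nl"))
print([e["id"] for e in filter_by_description(sample, "organic compound")])
```

The description match is a plain substring test, which is why the SPARQL route feels cleaner: there the class membership is stated explicitly instead of guessed from free text.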

On Mon, Apr 25, 2016 at 19:53, Egon Willighagen <egon.willighagen@gmail.com> wrote:
On Mon, Apr 25, 2016 at 7:23 PM, Sebastian Burgstaller
<sebastian.burgstaller@gmail.com> wrote:
> A way to achieve this could be to fetch all labels and aliases for all
> chemical compounds in one query and store them locally in your web
> application. This certainly is only feasible if the number of compounds does
> not get too big in Wikidata. Currently, the query takes ~6 sec.

But the search time goes down when you have something to search on, it
seems... the following query takes <1.5s:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?cmpnd ?label WHERE {
{?cmpnd wdt:P279 wd:Q11173 .} UNION
{?cmpnd wdt:P31 wd:Q11173 .}
?cmpnd rdfs:label ?label .
FILTER (strstarts(?label, "a"))
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
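
If the prefix comes from user input in a web application, the query text has to be built per request. A minimal sketch in Python, assuming the public endpoint at https://query.wikidata.org/sparql and its query/format parameters; the helper names (sparql_string, build_query, build_request_url) are hypothetical, and the query body is the same as above with only the prefix substituted:

```python
from urllib.parse import urlencode

ENDPOINT = "https://query.wikidata.org/sparql"  # assumption: public query service

# Same query as above; @PREFIX@ marks where the string literal goes.
QUERY_TEMPLATE = """\
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?cmpnd ?label WHERE {
  {?cmpnd wdt:P279 wd:Q11173 .} UNION
  {?cmpnd wdt:P31 wd:Q11173 .}
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, @PREFIX@))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
"""


def sparql_string(text):
    """Quote `text` as a SPARQL string literal, escaping backslash, quote, newline."""
    escaped = (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
    )
    return '"' + escaped + '"'


def build_query(prefix):
    """Return the query text with the user-supplied label prefix filled in."""
    return QUERY_TEMPLATE.replace("@PREFIX@", sparql_string(prefix))


def build_request_url(prefix):
    """Return a GET URL for the endpoint, asking for JSON results."""
    return ENDPOINT + "?" + urlencode({"query": build_query(prefix), "format": "json"})


print(build_query("benz"))
print(build_request_url("benz"))
```

Escaping the literal matters because the prefix ends up inside the query text; without it a stray quote in the search box would break the query.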

BTW, like Magnus said... if you only want to find things with the
PubChem compound identifier, you could take that route:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?cmpnd ?label ?pubchemid WHERE {
?cmpnd wdt:P662 ?pubchemid .
?cmpnd rdfs:label ?label .
FILTER (strstarts(?label, "a"))
SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}
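
On the client side, the result comes back in the standard SPARQL JSON results layout (head.vars plus results.bindings), so turning it into plain rows is short. A rough sketch, with a hypothetical helper name and a made-up sample document (the entity IDs, labels and identifiers below are placeholders, not real data):

```python
import json


def rows_from_sparql_json(text):
    """Flatten a SPARQL JSON results document into a list of plain dicts."""
    doc = json.loads(text)
    names = doc["head"]["vars"]
    rows = []
    for binding in doc["results"]["bindings"]:
        # A variable can be missing from a binding if it was unbound.
        rows.append({n: binding[n]["value"] for n in names if n in binding})
    return rows


# Made-up sample in the shape of a response to the PubChem query above.
sample = json.dumps({
    "head": {"vars": ["cmpnd", "label", "pubchemid"]},
    "results": {"bindings": [
        {
            "cmpnd": {"type": "uri", "value": "http://www.wikidata.org/entity/Q_EXAMPLE_1"},
            "label": {"type": "literal", "xml:lang": "en", "value": "a-sample"},
            "pubchemid": {"type": "literal", "value": "000"},
        },
        {
            "cmpnd": {"type": "uri", "value": "http://www.wikidata.org/entity/Q_EXAMPLE_2"},
            "label": {"type": "literal", "xml:lang": "nl", "value": "a-voorbeeld"},
        },
    ]},
})

for row in rows_from_sparql_json(sample):
    print(row)
```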

But I am not sure that is a lot faster...

Also keep in mind that the query service seems to do a reasonable job of
caching search results...

Egon

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
