What you really want, then, is a case-insensitive regex; otherwise you will only match labels whose first letter has the same case as your search string. You can also drop the wikibase:label service (the SERVICE wikibase:label line), as it seems to slow the query down by about 0.7 sec. The query below takes 1.4 sec for me now.

SELECT DISTINCT ?cmpnd ?label WHERE {
  {?cmpnd wdt:P279 wd:Q11173 .} UNION
  {?cmpnd wdt:P31 wd:Q11173 .}
  ?cmpnd rdfs:label ?label .
  FILTER(regex(str(?label), "^a", "i"))
}
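
To illustrate the difference outside the query service, here is a minimal Python sketch that mimics the two filters on a few labels. The label list is made-up sample data, and the two function names are only for illustration; the first behaves like strstarts(?label, "a"), the second like regex(str(?label), "^a", "i").

```python
import re

# Made-up sample labels, only for illustration (not query output).
labels = ["Acetone", "aspirin", "Benzene", "alanine", "Ammonia"]

def starts_with_case_sensitive(labels, prefix):
    # Comparable to FILTER(strstarts(?label, "a")): the case must match exactly.
    return [label for label in labels if label.startswith(prefix)]

def starts_with_case_insensitive(labels, prefix):
    # Comparable to FILTER(regex(str(?label), "^a", "i")).
    # re.escape keeps characters such as "(" or "." in the prefix literal.
    pattern = re.compile("^" + re.escape(prefix), re.IGNORECASE)
    return [label for label in labels if pattern.search(label)]

print(starts_with_case_sensitive(labels, "a"))
print(starts_with_case_insensitive(labels, "a"))
```

The case-sensitive version misses "Acetone" and "Ammonia", which is the problem with the first letter mentioned above.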


On Mon, Apr 25, 2016 at 12:16 PM, Herman Bergwerf <hermanbergwerf@gmail.com> wrote:
Hmm, with a contains filter this is perfect actually! I must have been doing something wrong because my query took about 20 seconds (I'm new to SPARQL, I think it was slow because I copied `?compound wdt:P31/wdt:P279* wd:Q11173` from somewhere).
An alternative I came up with is using https://www.wikidata.org/w/api.php?action=wbsearchentities&search=benzeen&language=$targetLanguage and filtering out the entries that have 'organic compound' in their description etc. but this is much cleaner.
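
A rough Python sketch of that alternative, assuming the intent is to keep only the hits whose description mentions a given phrase. The function names and the sample response are hypothetical; the sample is hand-written in the shape of a wbsearchentities JSON reply (a "search" list of hits with "id", "label" and "description"), not real API output, and the ids are placeholders.

```python
from urllib.parse import urlencode

API_URL = "https://www.wikidata.org/w/api.php"

def build_search_url(term, language):
    # Same parameters as the URL above, plus format=json to get a JSON reply.
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)

def filter_by_description(response, keyword):
    # Keep hits whose description mentions the keyword (case-insensitive).
    # Hits without a description are dropped.
    keyword = keyword.lower()
    return [
        hit for hit in response.get("search", [])
        if keyword in hit.get("description", "").lower()
    ]

# Hand-written sample with placeholder ids (assumption, not real data).
sample = {
    "search": [
        {"id": "Q_EXAMPLE_1", "label": "benzeen", "description": "Organic compound"},
        {"id": "Q_EXAMPLE_2", "label": "Benzeen", "description": "album"},
        {"id": "Q_EXAMPLE_3", "label": "benzeen"},
    ]
}

print(build_search_url("benzeen", "nl"))
print([hit["id"] for hit in filter_by_description(sample, "organic compound")])
```

Matching on free-text descriptions is fragile (they differ per language and per item), which is one reason the SPARQL route is cleaner.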

On Mon, Apr 25, 2016 at 19:53, Egon Willighagen <egon.willighagen@gmail.com> wrote:
On Mon, Apr 25, 2016 at 7:23 PM, Sebastian Burgstaller
<sebastian.burgstaller@gmail.com> wrote:
> A way to achieve this could be to fetch all labels and aliases for all
> chemical compounds in one query and store them locally in your web
> application. This certainly is only feasible if the number of compounds does
> not get too big in Wikidata. Currently, the query takes ~6 sec.
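
For reference, the local-store idea could look roughly like the Python sketch below: fetch (name, id) pairs once, keep them sorted, and answer prefix searches with a binary search. The function names, the sample rows and the ids are all hypothetical; a real application would fill the rows from the query result.

```python
from bisect import bisect_left

def build_index(rows):
    # rows: iterable of (name, entity_id), where name is a label or an alias.
    # Names are case-folded so that lookups are case-insensitive.
    return sorted((name.casefold(), entity_id) for name, entity_id in rows)

def prefix_search(index, prefix):
    # Binary search for the first entry >= prefix, then scan forward
    # while entries still start with the prefix.
    prefix = prefix.casefold()
    start = bisect_left(index, (prefix,))
    hits = []
    for name, entity_id in index[start:]:
        if not name.startswith(prefix):
            break
        hits.append((name, entity_id))
    return hits

# Made-up sample rows with placeholder ids (assumption, not Wikidata output).
rows = [
    ("Acetone", "C1"),
    ("aspirin", "C2"),
    ("acetylsalicylic acid", "C2"),
    ("Benzene", "C3"),
]
index = build_index(rows)
print(prefix_search(index, "ace"))
```

The index has to be rebuilt whenever the labels are fetched again, so this only pays off if the data changes slowly compared to how often people search.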

But the search time goes down when you have something to search on, it
seems... the following query takes <1.5s:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?cmpnd ?label WHERE {
  {?cmpnd wdt:P279 wd:Q11173 .} UNION
  {?cmpnd wdt:P31 wd:Q11173 .}
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, "a"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

BTW, like Magnus said... if you only want to find things with the
PubChem compound identifier, you could take that route:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?cmpnd ?label ?pubchemid WHERE {
  ?cmpnd wdt:P662 ?pubchemid .
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, "a"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

But I am not sure that is a lot faster...

Also keep in mind that it seems to do a reasonable job at caching
search results...


