What you really want, then, is a case-insensitive regex; otherwise you will only match labels whose first letter has the same case as your search string. You should also remove the SERVICE wikibase:label clause, as it seems to slow the query down by about 0.7 sec. The query below takes 1.4 sec for me now.

SELECT DISTINCT ?cmpnd ?label WHERE {
  {?cmpnd wdt:P279 wd:Q11173 .} UNION
  {?cmpnd wdt:P31 wd:Q11173 .}
  ?cmpnd rdfs:label ?label .
  FILTER(regex(str(?label), "^a", "i"))
}
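
To make the difference concrete, here is a small local sketch in Python (not SPARQL) of what the two filters do. The sample labels are made up for illustration; `startswith` stands in for strstarts and the compiled pattern stands in for regex(..., "^a", "i").

```python
import re

# Hypothetical sample labels; real ones would come from ?label in the query.
labels = ["acetone", "Acetone", "benzene", "Aspirin"]

# Case-sensitive prefix test, comparable to strstarts(?label, "a").
case_sensitive = [label for label in labels if label.startswith("a")]

# Case-insensitive anchored regex, comparable to regex(str(?label), "^a", "i").
pattern = re.compile(r"^a", re.IGNORECASE)
case_insensitive = [label for label in labels if pattern.search(label)]

print(case_sensitive)
print(case_insensitive)
```

The case-sensitive test misses every label that starts with a capital letter, which is why the "i" flag matters for an autocomplete-style search.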

cheers,
Sebastian

On Mon, Apr 25, 2016 at 12:16 PM, Herman Bergwerf <hermanbergwerf@gmail.com> wrote:
Hmm, with a contains filter this is perfect actually! I must have been doing something wrong because my query took about 20 seconds (I'm new to SPARQL, I think it was slow because I copied `?compound wdt:P31/wdt:P279* wd:Q11173` from somewhere).
An alternative I came up with is using https://www.wikidata.org/w/api.php?action=wbsearchentities&search=benzeen&language=$targetLanguage and filtering out the entries that have 'organic compound' in their description etc. but this is much cleaner.
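
For reference, a rough Python sketch of that alternative. It assumes "filtering out" means keeping only the hits whose description mentions the phrase, and that the API is asked for JSON (`format=json` is my addition to the URL above). The helper names and the sample response are hypothetical; only the `search` list with `id`, `label` and `description` fields follows the wbsearchentities response shape.

```python
from urllib.parse import urlencode

API = "https://www.wikidata.org/w/api.php"


def search_url(term, language):
    """Build the wbsearchentities URL; format=json is an added assumption."""
    params = {
        "action": "wbsearchentities",
        "search": term,
        "language": language,
        "format": "json",
    }
    return API + "?" + urlencode(params)


def keep_by_description(response, needle):
    """Keep only hits whose description contains needle (case-insensitive)."""
    needle = needle.lower()
    return [
        hit
        for hit in response.get("search", [])
        # Some hits have no description at all, hence the default.
        if needle in hit.get("description", "").lower()
    ]


# Hypothetical response with placeholder ids, not real Wikidata items.
sample = {
    "search": [
        {"id": "Q_EXAMPLE_1", "label": "benzeen", "description": "organic compound"},
        {"id": "Q_EXAMPLE_2", "label": "Benzeen", "description": "album"},
        {"id": "Q_EXAMPLE_3", "label": "benzeen x"},
    ]
}

print(search_url("benzeen", "nl"))
print([hit["id"] for hit in keep_by_description(sample, "organic compound")])
```

The weak point is that this depends on the free-text description, which differs per language and per item, whereas the SPARQL query filters on P31/P279 statements.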

On Mon, Apr 25, 2016 at 19:53, Egon Willighagen <egon.willighagen@gmail.com> wrote:
On Mon, Apr 25, 2016 at 7:23 PM, Sebastian Burgstaller
<sebastian.burgstaller@gmail.com> wrote:
> A way to achieve this could be to fetch all labels and aliases for all
> chemical compounds in one query and store them locally in your web
> application. This is certainly only feasible if the number of compounds does
> not get too big in Wikidata. Currently, the query takes ~6 sec.

But the search time goes down when you have something to search on, it
seems... the following query takes <1.5s:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT DISTINCT ?cmpnd ?label WHERE {
  {?cmpnd wdt:P279 wd:Q11173 .} UNION
  {?cmpnd wdt:P31 wd:Q11173 .}
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, "a"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

BTW, like Magnus said... if you only want to find things with the
PubChem compound identifier, you could take that route:

PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX bd: <http://www.bigdata.com/rdf#>

SELECT DISTINCT ?cmpnd ?label ?pubchemid WHERE {
  ?cmpnd wdt:P662 ?pubchemid .
  ?cmpnd rdfs:label ?label .
  FILTER (strstarts(?label, "a"))
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" }
}

But I am not sure that is a lot faster...

Also keep in mind that the query service seems to do a reasonable job at
caching search results...
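
Coming back to Sebastian's idea quoted above of storing all labels locally: a minimal Python sketch of what the lookup side could look like, assuming the labels and aliases have already been fetched as (label, entity id) pairs. The class name, method name and sample ids are hypothetical, and the sorted list plus binary search is just one possible design.

```python
from bisect import bisect_left


class LabelIndex:
    """Case-insensitive prefix lookup over locally stored labels."""

    def __init__(self, pairs):
        # Sort by lowercased label so all matches for a prefix are adjacent.
        self._rows = sorted((label.lower(), label, qid) for label, qid in pairs)
        self._keys = [row[0] for row in self._rows]

    def starts_with(self, prefix, limit=10):
        prefix = prefix.lower()
        # Binary search for the first key that is >= the prefix.
        start = bisect_left(self._keys, prefix)
        hits = []
        for key, label, qid in self._rows[start:]:
            if not key.startswith(prefix) or len(hits) >= limit:
                break
            hits.append((label, qid))
        return hits


# Hypothetical data with placeholder ids, not real Wikidata items.
index = LabelIndex([
    ("Acetone", "Q_A"),
    ("benzene", "Q_B"),
    ("acetic acid", "Q_C"),
    ("Aspirin", "Q_D"),
])

print(index.starts_with("ac"))
print(index.starts_with("A"))
```

With the labels held in memory, each keystroke costs a binary search rather than a round trip to the query service, at the price of re-fetching the list every now and then.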

Egon

--
E.L. Willighagen
Department of Bioinformatics - BiGCaT
Maastricht University (http://www.bigcat.unimaas.nl/)
Homepage: http://egonw.github.com/
LinkedIn: http://se.linkedin.com/in/egonw
Blog: http://chem-bla-ics.blogspot.com/
PubList: http://www.citeulike.org/user/egonw/tag/papers
ORCID: 0000-0001-7542-0286
ImpactStory: https://impactstory.org/EgonWillighagen

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
