Hi Tim,
Pywikibot has generators around the API. For search, for example, you have https://doc.wikimedia.org/pywikibot/master/api_ref/pywikibot.html#pywikibot.pagegenerators.SearchPageGenerator . So basically anything you can search for as a user can also be used as a generator in Pywikibot.
Say, for example, you want all bands that have "Bush" in their name. We have the band Bush at https://www.wikidata.org/wiki/Q247949 . With a bit of a trick you can see what the search engine knows about a page: https://www.wikidata.org/w/index.php?title=Q247949&action=cirrusdump . We can use this to restrict the search results to items that are an instance of (P31) band (Q215380) with the haswbstatement keyword, see https://www.wikidata.org/w/index.php?search=bush+haswbstatement%3A%22P31%3DQ215380%22&title=Special%3ASearch&profile=advanced&fulltext=1&advancedSearch-current=%7B%7D&ns0=1&ns120=1 or as API output at https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=bush%20haswbstatement:%22P31=Q215380%22&format=json
Pywikibot accepts the same search string:
>>> import pywikibot
>>> from pywikibot import pagegenerators
>>> query = 'bush haswbstatement:"P31=Q215380"'
>>> repo = pywikibot.Site().data_repository()
>>> searchgen = pagegenerators.SearchPageGenerator(query, site=repo)
>>> for item in searchgen:
...     print(item.title())
...
Q1156378
Q16945866
Q16953971
Q247949
Q2928714
Q5001360
Q5001432
Q7720714
Q7757229
>>>
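If you need more than one such filter, the search string can be composed in code. This is only a rough sketch: build_query is a hypothetical helper of mine, not part of Pywikibot, and it assumes the haswbstatement keyword that the Wikidata search engine uses for statement filters.

```python
def build_query(text, statements=()):
    """Compose a search string for Wikidata's search engine.

    text:       free-text part of the query, e.g. "bush"
    statements: iterable of (property, value) pairs that every hit
                must have, e.g. [("P31", "Q215380")]
    """
    parts = [text]
    for prop, value in statements:
        # One haswbstatement filter per required statement.
        parts.append(f'haswbstatement:"{prop}={value}"')
    return " ".join(parts)


# Same query as in the session above.
query = build_query("bush", [("P31", "Q215380")])
```

The resulting string can be passed to pagegenerators.SearchPageGenerator(query, site=repo) exactly as in the session above.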
Maarten
Yes, the API is at https://www.wikidata.org/w/api.php?action=query&list=search&srsearch=Bush
There's a sandbox where you can play with the various options:
https://www.wikidata.org/wiki/Special:ApiSandbox#action=query&format=json&list=search&srsearch=Bush
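If you would rather call the API directly than go through the sandbox, something like the following should work with just the Python standard library. Treat it as a sketch: the function names and the User-Agent string are placeholders I made up, and srlimit is the only option set besides the ones in the URL above.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://www.wikidata.org/w/api.php"


def build_search_url(term, limit=10):
    """Build the action=query&list=search URL for a search term."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": term,
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_titles(response):
    """Pull the page titles (item IDs such as Q247949) out of a decoded reply."""
    return [hit["title"] for hit in response.get("query", {}).get("search", [])]


def search(term, limit=10):
    """Run the search over HTTP and return the matching titles."""
    request = urllib.request.Request(
        build_search_url(term, limit),
        # Placeholder User-Agent; replace it with one that identifies your tool.
        headers={"User-Agent": "wikidata-search-example/0.1"},
    )
    with urllib.request.urlopen(request) as reply:
        return extract_titles(json.load(reply))
```

Calling search("Bush") then returns a list of item IDs for the first page of results.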
On Tue, Jun 4, 2019 at 2:22 PM Tim Finin <finin@umbc.edu> wrote:
What's the best way to search Wikidata for items whose name or alias matches a string? The search available via pywikibot seems to only find a match if the search string is a prefix of an item's name or alias, so searching for "Bush" does not return any of the George Bush items. I don't want to use a SPARQL query with a regex, since I expect that to be slow.
The search box on the Wikidata pages is closer to what I want. Is there a good way to call this via an API?
Ideally, I'd like to be able to specify a language and also a set of types, but I can do that once I've identified candidates based on a simple match with a query string.
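The post-filtering step described here could look roughly like the sketch below. It assumes each candidate is available as entity JSON in the shape the wbgetentities API module returns (labels, aliases, claims); the function name matches and its parameters are hypothetical, and "type" is taken to mean a direct instance of (P31) value, with no subclass lookup.

```python
def matches(entity, text, language="en", types=()):
    """Check one candidate against a search string, a language and a set of types.

    entity: entity JSON with "labels", "aliases" and "claims" keys
    text:   string that must occur in the label or an alias (case-insensitive)
    types:  item IDs; if given, the entity must have one of them as P31
    """
    needle = text.casefold()

    # Collect the label and aliases in the requested language.
    names = []
    label = entity.get("labels", {}).get(language)
    if label:
        names.append(label["value"])
    names.extend(alias["value"] for alias in entity.get("aliases", {}).get(language, []))
    if not any(needle in name.casefold() for name in names):
        return False

    if not types:
        return True

    # Collect the direct P31 values; claims without a value are skipped.
    found = set()
    for claim in entity.get("claims", {}).get("P31", []):
        value = claim.get("mainsnak", {}).get("datavalue", {}).get("value", {})
        if isinstance(value, dict) and "id" in value:
            found.add(value["id"])
    return bool(found & set(types))
```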
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata