Hello,
I would like to better understand the difference between list=search and generator=search for full-text search.
I've read that list=search relies on Elasticsearch. What are the differences in indexing and in returned results between list=search and generator=search?
I also need the page IDs of the returned articles. With generator=search I can get them: the page IDs correspond to the returned pages (example in the sandbox: https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&prop=extracts|redirects&format=json&rdprop=pageid%7Ctitle&indexpageids=&generator=search&gsrsearch=dj%20tiesto )
But I cannot do it with list=search. I tried list=search + generator=allpages + the indexpageids parameter.
The page IDs in query['pageids'] *are not related* to the articles in the query['search'] list. It looks like the generator runs its own query instead of taking the list as input.
Could you please help me write a query using list=search that also fetches the page IDs of the returned pages?
My sandbox attempt is:
https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=searc...
Thank you!
On Thu, Jan 28, 2016 at 5:21 AM, Luigi Assom luigi.assom@gmail.com wrote:
> Hello,
> I would like to better understand the difference between list=search and generator=search for full-text search.
> I've read that list=search relies on Elasticsearch. What are the differences in indexing and in returned results between list=search and generator=search?
Are you actually using generator=search? Below you state that you're using generator=allpages, which is obviously going to give you different results.
Try an example like https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=searc... instead.
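For readers following along, here is a minimal Python sketch of a combined list=search + generator=search request of the kind suggested above, and of one way the two result sets might be matched up afterwards. It assumes the usual JSON response shape (query['search'] as a list of hits, query['pages'] keyed by page ID) and the standard en.wikipedia.org endpoint. The helper names and the sample response are hypothetical, invented for illustration; they are not real API output.

```python
from urllib.parse import urlencode

# Standard Action API endpoint for English Wikipedia.
API_URL = "https://en.wikipedia.org/w/api.php"


def build_combined_query(keywords):
    """Build a URL that runs list=search and generator=search on the same keywords."""
    params = {
        "action": "query",
        "format": "json",
        "list": "search",
        "srsearch": keywords,
        "srprop": "snippet|titlesnippet",
        "indexpageids": "",
        "generator": "search",
        "gsrsearch": keywords,
    }
    return API_URL + "?" + urlencode(params)


def attach_page_ids(response):
    """Return the list=search hits, each with a 'pageid' looked up by title.

    Hits whose title is missing from query['pages'] get pageid=None.
    """
    query = response.get("query", {})
    id_by_title = {
        page["title"]: page["pageid"] for page in query.get("pages", {}).values()
    }
    return [
        dict(hit, pageid=id_by_title.get(hit["title"]))
        for hit in query.get("search", [])
    ]


# Hypothetical response fragment (titles and IDs are made up).
sample = {
    "query": {
        "pageids": ["101", "102"],
        "pages": {
            "101": {"pageid": 101, "ns": 0, "title": "Example A"},
            "102": {"pageid": 102, "ns": 0, "title": "Example B"},
        },
        "search": [
            {"ns": 0, "title": "Example B", "snippet": "..."},
            {"ns": 0, "title": "Example A", "snippet": "..."},
        ],
    }
}

if __name__ == "__main__":
    print(build_combined_query("dj tiesto"))
    for hit in attach_page_ids(sample):
        print(hit["pageid"], hit["title"])
```

The join by title only lines up if both halves of the request use the same search string and the same limits, which is an assumption of this sketch.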
Hi Brad,
I tried a query with: params = {'action': 'query', *'generator': 'search'*, 'gsrnamespace': 0, 'gsrsearch': keywords, 'gsrlimit': 20, 'prop': 'pageimages|extracts', 'pilimit': 'max', 'exintro': '', 'explaintext': '', 'exsentences': 3, 'exlimit': 'max', 'redirects': ''}
I want to try a different query with list=search, but also fetch the page_IDs for the results.
I tried: action=query&*list=search*&format=json&srsearch=gene%20editing&srprop=snippet&*indexpageids*=&*generator=allpages*
but the page IDs returned by the generator do not match the entries in the search list. https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=searc...
*How can I fetch page IDs for the results of list=search?*
I would also be happy to fetch decorations (images) in the same query, that is, to use a single generator that completes the list with page IDs and images.
Finally, I'd like to understand the difference between list=search and generator=search: do they reflect different indexing or a different architecture (e.g. response time, or indexing done in Elasticsearch vs. Lucene)?
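As an illustration of the generator=search route described in this message, here is a minimal Python sketch that packages the quoted parameters and flattens a response into one record per page, with page ID, title, extract and thumbnail together. It assumes that query['pages'] is keyed by page ID, that prop=pageimages puts the image under a 'thumbnail' object with a 'source' URL, and that pages may carry an 'index' field giving the search rank. The function names and the sample response are hypothetical, and 'format': 'json' is added here although it is not in the quoted dict.

```python
def search_params(keywords, limit=20):
    """Parameters for generator=search, mirroring the dict quoted in the message."""
    return {
        "action": "query",
        "format": "json",  # assumption: not part of the quoted dict
        "generator": "search",
        "gsrnamespace": 0,
        "gsrsearch": keywords,
        "gsrlimit": limit,
        "prop": "pageimages|extracts",
        "pilimit": "max",
        "exintro": "",
        "explaintext": "",
        "exsentences": 3,
        "exlimit": "max",
        "redirects": "",
    }


def pages_to_records(response):
    """Flatten query['pages'] into a list of records, ordered by 'index' if present."""
    pages = response.get("query", {}).get("pages", {}).values()
    ordered = sorted(pages, key=lambda page: page.get("index", 0))
    return [
        {
            "pageid": page["pageid"],
            "title": page["title"],
            "extract": page.get("extract"),
            "thumbnail": page.get("thumbnail", {}).get("source"),
        }
        for page in ordered
    ]


# Hypothetical response fragment (titles, IDs and URLs are made up).
sample_response = {
    "query": {
        "pages": {
            "201": {"pageid": 201, "ns": 0, "title": "Example A", "index": 2,
                    "extract": "Second hit."},
            "202": {"pageid": 202, "ns": 0, "title": "Example B", "index": 1,
                    "extract": "First hit.",
                    "thumbnail": {"source": "https://example.org/b.jpg",
                                  "width": 50, "height": 50}},
        }
    }
}

if __name__ == "__main__":
    for record in pages_to_records(sample_response):
        print(record)
```

The dict returned by search_params can be passed as query-string parameters to whatever HTTP client is in use; the sketch makes no network calls itself.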
On Thu, Jan 28, 2016 at 5:00 PM, Brad Jorsch (Anomie) <bjorsch@wikimedia.org> wrote:
> On Thu, Jan 28, 2016 at 5:21 AM, Luigi Assom luigi.assom@gmail.com wrote:
>> Hello,
>> I would like to better understand the difference in using list=search VS generator=search for full-text search.
>> I've read list=search relies on elastic search: which are the differences in indexing and differences in returned results between list=generator and generator=search ?
> Are you actually using generator=search? Below you state that you're using generator=allpages, which is obviously going to give you different results.
> Try an example like https://en.wikipedia.org/wiki/Special:ApiSandbox#action=query&list=search&format=json&srsearch=dj%20tiesto&srprop=snippet%7Ctitlesnippet&indexpageids=&generator=search&gsrsearch=dj%20tiesto instead.
> -- Brad Jorsch (Anomie) Senior Software Engineer Wikimedia Foundation
Mediawiki-api mailing list Mediawiki-api@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/mediawiki-api