On Tue, Jul 28, 2015 at 7:18 AM, David Causse <dcausse@wikimedia.org> wrote:
On 28/07/2015 15:09, Trey Jones wrote:
My first pass at data gathering at the end of the day yesterday was slightly skewed, but the general trend still holds: getting Nemo's suggested "insource:" queries into Lagotto will definitely cut down on the number of zero-results searches we get (and actually do what they intend), if the update gets pushed to their heavy users.

Beware that insource is the most expensive query type, and we allow only 20 insource queries to run concurrently (across all Wikipedia sites). I'm not sure it's a good idea to expose this tool too widely.
There are several features like that (I mean syntax that's not available in any other search engine with a large audience):
- wildcard queries (*)
- insource
- fuzzy searches

While these features are very useful to "expert users", I think we should not rely on such syntax to decrease the zero-result rate, because it won't scale.
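
To make the scaling concern concrete, here is a minimal sketch of a fail-fast concurrency cap like the 20-query limit mentioned above. It is not the actual Cirrus mechanism: the QueryLimiter class and its run() method are hypothetical names made up for illustration, and only the number 20 comes from this thread.

```python
import threading


class QueryLimiter:
    """Hypothetical helper: rejects work once the cap is reached."""

    def __init__(self, max_concurrent):
        # BoundedSemaphore tracks how many slots are still free.
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, query_fn):
        # Non-blocking acquire: fail fast rather than queue up
        # more expensive queries behind the ones already running.
        if not self._slots.acquire(blocking=False):
            raise RuntimeError("too many concurrent expensive queries")
        try:
            return query_fn()
        finally:
            # Always give the slot back, even if the query raised.
            self._slots.release()


# Assumed cap of 20, taken from the figure quoted in this thread.
insource_limiter = QueryLimiter(max_concurrent=20)
```

With a cap like this, every extra client that sends insource queries competes for the same fixed number of slots, so wider exposure means more rejected queries rather than more capacity.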


All of what David said.

insource: is a hack for two reasons:

1) We took away the previous default behavior when we embarked on Cirrus.
insource: was basically the old lsearchd behavior anyway.

2) I really really want us to replicate the indexes to labs (like we do databases)
so labs/tool users can freely query them and come up with all kinds of cool toys.
I think there's a task for it...but I can't find it (need coffee).

-Chad