On 27/08/2015 17:59, Kevin Smith wrote:

On Thu, Aug 27, 2015 at 4:30 AM, David Causse <dcausse@wikimedia.org> wrote:
There's another feature we could work on after this one:
Review the default AND operator between words. This seems to be in line with Moiz's survey results and "somewhat" related to the paper reviewed by Trey:
Users ask questions, not keywords. For example, this query:
what's the connection between power laws and zipf law [1]
returns no results

but:
power laws zipf distribution [2]
returns good results


Earlier, I suggested ignoring "filler" words, but we thought elastic was already doing scoring adjustments that would have a similar effect. Apparently not, because a search for:

connection between power laws zipf distribution

brings up what look like pretty reasonable results. Throwing away "what's", "the", and "and" before running the search would help a lot (at least in this case).
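
To make the filler-word idea concrete, here is a minimal Python sketch. The word list and the function name are assumptions made up for this example; this is not what the current search code does, and a real list would need tuning per language.

```python
# Hypothetical filler-word list, chosen only for this illustration.
FILLER_WORDS = {"what's", "what", "the", "and", "is", "a", "of"}

def strip_filler_words(query):
    """Drop filler words from a whitespace-tokenized query string."""
    kept = [word for word in query.split() if word.lower() not in FILLER_WORDS]
    return " ".join(kept)

if __name__ == "__main__":
    print(strip_filler_words("what's the connection between power laws and zipf law"))
```

The remaining words would then be sent to the search backend in place of the original query.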

Yes, the term that prevents the result from being found is "what".
Elasticsearch will limit the effect of such words on the score, but the default AND will still require all of these words to be in the document.
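
A small Python sketch of that difference, under simplifying assumptions: documents are modeled as plain sets of terms (no analysis or stemming), and the field name "text" in the query body is hypothetical. The "operator" parameter is the one accepted by the Elasticsearch match query.

```python
def matches(doc_terms, query, operator="and"):
    """Toy model: AND needs every query term in the document, OR needs any one."""
    present = [term in doc_terms for term in query.lower().split()]
    return all(present) if operator == "and" else any(present)

def match_query(text, operator):
    """Build a match query body; the field name "text" is an assumption."""
    return {"query": {"match": {"text": {"query": text, "operator": operator}}}}

if __name__ == "__main__":
    doc = {"connection", "between", "power", "laws", "zipf", "distribution"}
    query = "what connection between power laws zipf distribution"
    print(matches(doc, query, "and"))  # "what" is missing, so AND rejects the document
    print(matches(doc, query, "or"))   # OR accepts it; scoring then ranks the results
    print(match_query(query, "or"))
```

With OR, a rare term such as "zipf" still contributes much more to the score than a common one such as "what", so relaxing the operator mainly changes which documents are eligible, not how they are weighted.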

We also have some trouble with "what's" vs "what is"... I'll have a look.