On 27/08/2015 17:59, Kevin Smith wrote:
On Thu, Aug 27, 2015 at 4:30 AM, David Causse <dcausse(a)wikimedia.org> wrote:
There's another feature we could work on after this one:
Review the default AND operator between words. This seems to be in
line with Moiz's survey results and "somewhat" related to the
paper reviewed by Trey:
Users ask questions, not keywords. For example, this query:
what's the connection between power laws and zipf law [1]
returns no results,
but:
power laws zipf distribution [2]
returns good results
Earlier, I suggested ignoring "filler" words, but we thought elastic
was already doing scoring adjustments that would have a similar
effect. Apparently not, because a search for:
connection between power laws zipf distribution
brings up what look like pretty reasonable results. Throwing away
"what's", "the", and "and" before running the search would help a lot
(at least in this case).
Yes, the term that prevents the result from being found is "what".
Elasticsearch will limit the effect of such words on the score, but the
default AND operator will still require all of these words to be in the
document.
We also have some trouble with "what's" vs "what is"... I'll
have a look.
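
To make the matching-versus-scoring point concrete, here is a minimal Python sketch of how a default AND between words can reject a document because of a single filler word. It is a toy model, not how Elasticsearch or CirrusSearch actually analyzes text: the tokenizer, the FILLER_WORDS list, the sample document, and the function names are all assumptions made up for illustration.

```python
import re

# Hypothetical filler-word list, chosen only to mirror the example in this thread.
# The toy tokenizer below splits "what's" into "what" and "s", hence the "s" entry.
FILLER_WORDS = frozenset({"what", "s", "the", "and"})


def tokenize(text):
    """Toy analyzer: lowercase, then split on anything that is not a letter or digit."""
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]


def matches_and(query, document, ignore=frozenset()):
    """AND semantics: every query term (except ignored ones) must occur in the document."""
    doc_terms = set(tokenize(document))
    return all(t in doc_terms for t in tokenize(query) if t not in ignore)


def matches_or(query, document):
    """OR semantics: one shared term is enough; ranking would be left to scoring."""
    doc_terms = set(tokenize(document))
    return any(t in doc_terms for t in tokenize(query))


# Made-up document text that contains every query term except "what".
DOC = "Zipf's law and the connection between power laws"
QUERY = "what's the connection between power laws and zipf law"

print(matches_and(QUERY, DOC))                       # "what" is missing from DOC
print(matches_and(QUERY, DOC, ignore=FILLER_WORDS))  # filler words dropped first
print(matches_or(QUERY, DOC))
```

In this sketch the strict AND match fails only because of "what", while dropping the filler words first, or matching with OR and leaving the ordering to the scorer, both let the document through. Whether either approach is the right fix for the real index is a separate question.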