That sounds like a great set of ideas. Are we capturing these in
Phabricator tickets?
(Another approach to handling 'AND' would be something like:
If the query has a question mark:
    treat ANDs as strings rather than operators
else:
    use AND as an operator.
    If that produces zero results:
        round-trip with AND as a string.
)
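
The heuristic above could be sketched in Python roughly as follows. This is
only an illustration: the `search` callable and its `and_is_operator` flag are
hypothetical stand-ins for whatever the real backend exposes, not an existing
CirrusSearch API.

```python
from typing import Callable, List

# Hypothetical backend signature: (query, and_is_operator) -> list of hits.
SearchFn = Callable[[str, bool], List[str]]


def search_with_and_heuristic(query: str, search: SearchFn) -> List[str]:
    """Decide how to treat AND, following the steps sketched above."""
    if "?" in query:
        # Looks like a natural-language question: treat AND as a plain word.
        return search(query, False)
    # Otherwise try AND as a boolean operator first.
    results = search(query, True)
    if not results:
        # Zero results: run the query again with AND treated as a plain word.
        results = search(query, False)
    return results
```

The second backend call only happens on the zero-results path, so the common
case still costs a single query.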
On 27 August 2015 at 07:30, David Causse <dcausse(a)wikimedia.org> wrote:
Yes, you're right. Reading and re-reading the Cirrus config file, I can't find
anything that could bring more results just by tweaking some config values
:(
The next step is to use interwiki searches to run queries written in another
language, which is also a "big feature".
There's another feature we could work on after this one:
Review the default AND operator between words. This seems to be in line with
Moiz's survey results and "somewhat" related to the paper reviewed by Trey:
users ask questions, not keywords. For example, this query:
what's the connection between power laws and zipf law [1]
returns no result
but:
power laws zipf distribution [2]
returns good results
I think a first naive approach would be to review this default AND and try
something like: if there are more than X words, allow Y% to match.
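
A minimal sketch of that "more than X words, allow Y% to match" rule, in plain
Python. The defaults (X = 4 words, Y = 75%) are arbitrary placeholders chosen
for illustration, and the whitespace tokenization is deliberately naive.
Elasticsearch has a `minimum_should_match` parameter that expresses a similar
idea; whether it fits here would need checking.

```python
import math
from typing import Set


def required_matches(num_words: int, min_words: int, percent: float) -> int:
    """Number of query words that must match.

    Queries with at most `min_words` words keep the strict AND behaviour
    (every word required); longer queries only need `percent` of their words.
    """
    if num_words <= min_words:
        return num_words
    return max(1, math.ceil(num_words * percent / 100))


def query_matches(query: str, doc_words: Set[str],
                  min_words: int = 4, percent: float = 75.0) -> bool:
    """True if enough of the query's words appear in the document's words."""
    words = query.lower().split()
    hits = sum(1 for w in words if w in doc_words)
    return hits >= required_matches(len(words), min_words, percent)
```

With these placeholder values, a 10-word question needs 8 matching words
rather than all 10, so a few filler words no longer force zero results.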
[1]
https://en.wikipedia.org/w/index.php?title=Special:Search&search=what%2…
[2]
https://en.wikipedia.org/w/index.php?title=Special%3ASearch&profile=def…
On 27/08/2015 04:39, Oliver Keyes wrote:
So I'm hearing we may have a contender for 'big changes to the ZRR' then ;).
This seems to reinforce the 'big features, not small config changes'
approach to the problem.
On 26 August 2015 at 19:34, Trey Jones <tjones(a)wikimedia.org> wrote:
And that's in line with the previous experiment. If you have a 32% zero
results rate, reducing it by 38% (32% * (1-.38)) gives 19.84%. So, allow a
little rounding error in the "32", "38" and "19", and this is right on the
money.
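
The arithmetic above can be checked with a few lines of Python. The 32% and
38% figures are the ones quoted in this message and the 19% is the rounded
figure from the test run quoted below; the function names are only
illustrative.

```python
def predicted_rate(baseline: float, relative_reduction: float) -> float:
    """Zero-results rate after cutting `baseline` by `relative_reduction`."""
    return baseline * (1 - relative_reduction)


def observed_reduction(baseline: float, observed: float) -> float:
    """Relative reduction implied by going from `baseline` to `observed`."""
    return (baseline - observed) / baseline


# 32% baseline reduced by 38% (previous experiment).
print(f"{predicted_rate(0.32, 0.38):.4f}")      # 0.1984
# Reduction implied by the rounded 32% -> 19% figures.
print(f"{observed_reduction(0.32, 0.19):.1%}")  # 40.6%
```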
—Trey
P.S.: 2 + 2 = 5, for very large values of 2.
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
On Wed, Aug 26, 2015 at 3:58 PM, Erik Bernhardson
<ebernhardson(a)wikimedia.org> wrote:
I ran some zero result rate tests against this API today. It is a huge
reduction in the zero result rate over the existing prefix search: from 32%
to 19% (on a 1% sample of prefix searches for an entire day).
On Wed, Aug 26, 2015 at 12:34 PM, Stas Malyshev <smalyshev(a)wikimedia.org>
wrote:
Hi!
I uploaded a small HTML page to compare both approaches:
http://cirrus-browser-bot.wmflabs.org/suggest.html
This is very cool! From my very short testing, it seems to work
pretty nicely.
--
Stas Malyshev
smalyshev(a)wikimedia.org
_______________________________________________
Wikimedia-search mailing list
Wikimedia-search(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikimedia-search