Hi!
> I worry that with the "best possible query"
> will be confusing or
> undesirable because the ranking will be lousy, as you suggest. You can
True, but since the other options do not solve the ranking problem either, we
at least get something stable and predictable there. I am not very clear
on what ranking means for mixed searches anyway, so maybe a lousy one is
OK as long as the user's request is "just find me something".
> For the "garden of forking queries" (most excellent poetic naming of
> options!!), there is a straightforward though somewhat tedious way of
> merging rankings. You can use an empirical distribution function
Thanks, it sounds like a good idea, but I assume that to implement it we'd need:
1. A distribution profile for each of the query types, which I assume
will be highly specialized
2. Some code that actually does the score merging inside Elastic (since
in order to do pagination we need Elastic to do all the ranking)
And the ranking would still suck initially, until we had collected a proper
distribution. This would also make getting the system set up pretty
non-trivial: after deploying the code, we'd need to collect the stats,
calculate the distribution, and then feed it back to the code, which
for an open-source component like Wikibase sounds a bit sub-optimal.
Still, it's excellent information, which at least gives us a
theoretical way forward here, even though it requires a lot of work.
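
To check my own understanding of the idea, here is a rough Python sketch of
how I read it. It is only an illustration under my assumptions, not a
proposal for the actual implementation: the function names and the query
type labels are made up, and it merges scores on the client side, so it
does not address point 2 above (doing the merge inside Elastic so that
pagination works). Each raw score is replaced by the fraction of previously
collected scores of the same query type that are less than or equal to it,
which puts all query types on a common 0..1 scale.

```python
from bisect import bisect_right


def make_ecdf(sample_scores):
    """Build an empirical distribution function from collected scores.

    The returned function maps a raw score to the fraction of sample
    scores that are <= that score, i.e. a value between 0 and 1.
    """
    ordered = sorted(sample_scores)
    n = len(ordered)
    if n == 0:
        raise ValueError("need at least one sample score")

    def ecdf(score):
        return bisect_right(ordered, score) / n

    return ecdf


def merge_rankings(results_by_type, ecdf_by_type):
    """Merge hits from several query types into one ranked list.

    results_by_type: {query_type: [(doc_id, raw_score), ...]}
    ecdf_by_type:    {query_type: function built by make_ecdf}
    Returns [(normalized_score, query_type, doc_id), ...], best first.
    """
    merged = []
    for query_type, hits in results_by_type.items():
        ecdf = ecdf_by_type[query_type]
        for doc_id, raw_score in hits:
            merged.append((ecdf(raw_score), query_type, doc_id))
    # Sort on the normalized score only; ties keep their insertion order.
    merged.sort(key=lambda item: item[0], reverse=True)
    return merged
```

The sample scores passed to make_ecdf are exactly the "distribution
profile" from point 1, so the sketch also shows why the ranking cannot be
good until those stats have been collected for each query type.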
--
Stas Malyshev
smalyshev(a)wikimedia.org