The satisfaction metric covers all searches across all wikis, including
searches on the Wiktionaries.
I don't think adding new information, such as Portuguese results on
Spanish Wikipedia, is going to have nearly the same level of impact as
improving the existing search results. We have an incredible amount of
information within the individual wikis that we fail to surface. The
existing scoring algorithms are naive and don't even begin to incorporate
what the world has learned in the last 20 years. For example, one of
the factors we use as an indicator of relevance is the number of incoming
links, but almost 20 years ago two PhD students from Stanford described a
significantly better way, called PageRank. I think we would be much better
off integrating the lessons of the last 20 years than naively
throwing more data onto the pile and hoping something better comes out.
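
As a rough illustration of the difference, here is a toy sketch. It is
not what our scoring does today; the link graph, the function names and
the damping value of 0.85 are all assumptions made up for the example.
It compares a raw incoming-link count with a basic power-iteration
PageRank on the same graph.

```python
def incoming_link_counts(links):
    """Count how many links point at each page."""
    counts = {}
    for source, targets in links.items():
        counts.setdefault(source, 0)
        for target in targets:
            counts[target] = counts.get(target, 0) + 1
    return counts


def pagerank(links, damping=0.85, iterations=50):
    """Basic power-iteration PageRank over a dict of page -> outgoing links."""
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    pages = sorted(pages)
    n = len(pages)

    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        # Every page gets the "random jump" share first.
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # Split this page's rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for target in targets:
                    new_rank[target] += share
            else:
                # Page with no outgoing links: spread its rank over all pages.
                share = damping * rank[page] / n
                for target in pages:
                    new_rank[target] += share
        rank = new_rank
    return rank


# Hypothetical graph: "A" is linked from three obscure pages, while "B" is
# linked only once, but from a "Hub" page that many other pages link to.
links = {
    "x1": ["A"], "x2": ["A"], "x3": ["A"],
    "h1": ["Hub"], "h2": ["Hub"], "h3": ["Hub"], "h4": ["Hub"], "h5": ["Hub"],
    "Hub": ["B"],
}

counts = incoming_link_counts(links)
ranks = pagerank(links)

print("incoming links:", counts["A"], "for A,", counts["B"], "for B")
print("PageRank: %.4f for A, %.4f for B" % (ranks["A"], ranks["B"]))
```

By link count, A beats B three to one. PageRank puts B ahead, because
its single link comes from a page that is itself heavily linked.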
Erik B
On Nov 5, 2015 11:28 PM, "Federico Leva (Nemo)" <nemowiki(a)gmail.com>
wrote:
Erik Bernhardson, 05/11/2015 22:56:
My concern is that our current user satisfaction metric suggests 15% of
users are happy with the results they are getting. This is really bad. I
would prefer to see us focus on search relevance and improving the
scoring of what we already have before spending more focus on interwiki
search.
What expense? The code for interwiki search is ready, only
https://phabricator.wikimedia.org/T96881 needs fixing AFAIK. I agree that
inlining the interwiki results is a bit harder, maybe that can be done
later.
15 % is low satisfaction (but is that en.wiki? en.wiki has lots of
garbage of course), sure. On the bright side, that means it's easy to
improve. To make up numbers: if even just 3 % of users search for dictionary
definitions on Wikipedia, interwiki search could increase the pool of happy
users by 20 %. ;-)
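
Spelling out those made-up numbers, under the assumption that the 3 % are
all currently unhappy users who would become happy (variable names are
just for illustration):

```python
happy_share = 0.15       # share of users currently satisfied, per the metric
newly_satisfied = 0.03   # made-up share of users looking for definitions

# Relative growth of the pool of happy users.
relative_increase = newly_satisfied / happy_share

# Absolute satisfaction after the change, in share of all users.
new_happy_share = happy_share + newly_satisfied

print("relative increase: %.0f %%" % (relative_increase * 100))
print("new satisfaction: %.0f %%" % (new_happy_share * 100))
```

So the 20 % is relative to the happy pool; overall satisfaction would move
by 3 percentage points.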
Nemo
_______________________________________________
discovery mailing list
discovery(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery