Hi!
> Where can I learn about the internals of this jewel? (which search
> engine, what metrics are used to rank items, and so on).
Thanks for your kind words. You can track the work here:
https://phabricator.wikimedia.org/T125500
and associated tasks like this one:
https://phabricator.wikimedia.org/T178851
which contain links to the patches. The search runs on the same
ElasticSearch we use for search on other sites, but the prototype has
specific code to handle Wikidata-specific data structures and the fact
that Wikidata is, unlike most other Wikimedia sites, multilingual by design.
The rankings are hand-tuned and currently kind of hard to read
(we're working on improving this). They are contained here:
https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/repo/config/
and the specific functions we're using are here:
https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/repo/config/ElasticSearchRescoreFunctions.php;4c6aa54e56c68ebd3543b23c88f52ae6f176a079$25
Basically it's a combination of match score (how well the string matches
the query), incoming link count, sitelink count and special boosts such as
demoting disambiguation pages.
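To make that combination a bit more concrete, here is a rough sketch in
Python. It is not the actual rescore configuration: the weights, the log
scaling and all names (Candidate, rescore, rank, the W_* constants) are
made up for illustration, and the real functions live in the config files
linked above.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    label: str
    match_score: float       # how well the string matches the query
    incoming_links: int      # number of incoming links to the item
    sitelinks: int           # number of sitelinks on the item
    is_disambiguation: bool = False


# Hypothetical weights, chosen for illustration only (not the tuned values).
W_MATCH = 1.0
W_INCOMING = 0.3
W_SITELINKS = 0.5
DISAMBIG_FACTOR = 0.1        # assumed multiplicative demotion


def rescore(c: Candidate) -> float:
    """Combine the match score with popularity signals.

    log1p dampens the counts so a very popular item cannot completely
    drown out the text match (an assumption of this sketch).
    """
    score = (
        W_MATCH * c.match_score
        + W_INCOMING * math.log1p(c.incoming_links)
        + W_SITELINKS * math.log1p(c.sitelinks)
    )
    if c.is_disambiguation:
        score *= DISAMBIG_FACTOR  # demote disambiguation pages
    return score


def rank(candidates):
    """Return candidates ordered from best to worst combined score."""
    return sorted(candidates, key=rescore, reverse=True)


if __name__ == "__main__":
    results = rank([
        Candidate("Paris (disambiguation)", 1.0, 50, 30, True),
        Candidate("Paris, Texas", 0.8, 500, 40),
        Candidate("Paris", 1.0, 10000, 250),
    ])
    for c in results:
        print(f"{rescore(c):6.2f}  {c.label}")
```

With these made-up numbers the well-linked item comes first and the
disambiguation page drops to the bottom even though its string match is
just as good.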
--
Stas Malyshev
smalyshev@wikimedia.org