Hi Stas,
I guess it's using an older index from a few weeks ago? It doesn't seem to have the latest properties that have landed, but that's OK if the ES index isn't current yet and you're just experimenting and getting feedback.
http://wikidata-wdsearch.wmflabs.org/w/index.php?search=partition&title=...
I didn't see https://www.wikidata.org/wiki/Property:P4653 in the results.
On Mon, Dec 18, 2017 at 1:20 PM Stas Malyshev smalyshev@wikimedia.org wrote:
Hi!
Where can I learn about the internals of this jewel? (which search engine, what metrics are used to rank items, and so on).
Thanks for your kind words. You can track it here:
https://phabricator.wikimedia.org/T125500
and associated tasks like this one: https://phabricator.wikimedia.org/T178851
which contain links to the patches. The search runs on the same Elasticsearch we use for search on other sites, but the prototype has specific code to deal with the Wikidata-specific data structure and the fact that it is, unlike most other Wikimedia sites, multilingual by design.
The rankings are currently hand-tuned and somewhat hard to read (we're working on improving this). They are contained here: https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/repo/config/ and the specific functions we're using are here:
https://phabricator.wikimedia.org/diffusion/EWBA/browse/master/repo/config/E...
Basically it's a combination of match score (how well the string matches the query), incoming link count, sitelink count, and special boosts such as demoting disambiguation pages.
--
Stas Malyshev smalyshev@wikimedia.org
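To make the combination described above concrete, here is a rough Python sketch of how such signals could be folded into one score. It is not the prototype's actual code: the weights, the logarithmic damping of the counts, the multiplicative demotion factor, and all names (Candidate, rank_score, rank) are assumptions chosen only for illustration. The real hand-tuned values live in the config linked above.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    """One search hit with the signals mentioned in the message."""
    label: str
    match_score: float        # how well the string matches the query
    incoming_links: int       # incoming link count
    sitelinks: int            # sitelink count
    is_disambiguation: bool = False


# Hypothetical weights, NOT the values used by the prototype.
W_MATCH = 1.0
W_LINKS = 0.3
W_SITELINKS = 0.5
DISAMBIGUATION_FACTOR = 0.1   # assumed demotion multiplier


def rank_score(c: Candidate) -> float:
    """Combine match score and popularity signals into a single number."""
    score = (
        W_MATCH * c.match_score
        # log1p dampens large counts so popularity cannot swamp the match score
        + W_LINKS * math.log1p(c.incoming_links)
        + W_SITELINKS * math.log1p(c.sitelinks)
    )
    if c.is_disambiguation:
        # "special boost": demote disambiguation pages
        score *= DISAMBIGUATION_FACTOR
    return score


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Return candidates ordered from highest to lowest score."""
    return sorted(candidates, key=rank_score, reverse=True)
```

With equal match scores, a well-linked item would sort ahead of an obscure one under this sketch, and a disambiguation page would drop below both.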
Wikidata mailing list Wikidata@lists.wikimedia.org https://lists.wikimedia.org/mailman/listinfo/wikidata