Hi!
Wikidata’s birthday is still a few days away but since there are no
deployments on Sundays we’ll get started with an early present ;-)
The Wikidata and Search Platform teams are happy to announce that
Wikidata prefix search (aka the wbsearchentities API, aka the thing you
use when you type into that box on the top right or any time you edit an
item or property and use the selector widget) is now using a new and
improved Elasticsearch backend. You should not see any changes except
for relevancy and ranking improvements.
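For anyone who wants to try the API directly, here is a minimal Python
sketch that only builds a wbsearchentities request URL. The helper name
build_search_url and the default values are assumptions made for
illustration; the parameters shown (action, search, language, type,
limit, format) are the commonly used ones for this API module.

```python
from urllib.parse import urlencode

# Public Wikidata API endpoint.
API_ENDPOINT = "https://www.wikidata.org/w/api.php"


def build_search_url(prefix, language="en", limit=7, entity_type="item"):
    """Build a wbsearchentities URL for a prefix search.

    The function name and defaults are illustrative assumptions,
    not part of any official client library.
    """
    params = {
        "action": "wbsearchentities",
        "search": prefix,          # the text typed so far
        "language": language,      # language to search labels/aliases in
        "type": entity_type,       # "item" or "property"
        "limit": limit,
        "format": "json",
    }
    return API_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Prints the URL; fetch it with any HTTP client to see the results.
    print(build_search_url("Douglas Ad"))
```

Fetching the printed URL returns JSON with a list of matching entities,
so it is an easy way to compare ranking for a given prefix.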
Specifically improved are:
- better language support (matches along the language fallback chain,
and can also match in any other language, with a lower score)
- flexibility - we can now use Elasticsearch rescore profiles, which can
be tuned to take advantage of any fields we index for both matching and
boosting, including link counts, statement counts, label counts, (some)
statement values, etc. More improvements are coming soon in this area,
e.g. scoring disambiguation pages lower, scoring units higher in the
proper context, etc.
- optimization - we no longer need to store all search data in both DB
tables and Elastic indexes. All the data needed for searching and for
retrieving the results is stored in the Elastic index and retrieved in
a single query.
- maintainability - since it is now part of the general Wikimedia search
ecosystem, it can be maintained together with the rest of the search
mechanisms, using the same infrastructure, monitoring, etc.
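
To illustrate the language support point above, here is a rough Python
sketch of how a match along a fallback chain could be weighted against a
match in an unrelated language. This is not the actual implementation
(which lives in Elasticsearch queries and rescore profiles); the function
name, the 0.8 decay factor and the 0.1 any-language weight are all
assumptions chosen only to make the idea concrete.

```python
def score_label_match(prefix, labels, fallback_chain,
                      chain_decay=0.8, any_language_weight=0.1):
    """Return an illustrative score for a prefix match against labels.

    labels: dict mapping language code -> label text.
    fallback_chain: language codes, most preferred first.
    All weights here are hypothetical, not the production values.
    """
    prefix = prefix.lower()
    best = 0.0
    for lang, label in labels.items():
        if not label.lower().startswith(prefix):
            continue
        if lang in fallback_chain:
            # Earlier positions in the chain score higher.
            weight = chain_decay ** fallback_chain.index(lang)
        else:
            # Any other language still matches, but with a low score.
            weight = any_language_weight
        best = max(best, weight)
    return best


if __name__ == "__main__":
    labels = {"de": "Wien", "en": "Vienna", "fr": "Vienne"}
    chain = ["de-at", "de", "en"]
    for query in ("wie", "vien", "vienne", "xyz"):
        print(query, score_label_match(query, labels, chain))
```

With a de-at user, a match on the German label outranks one on the
English label, and a match that exists only in French still appears but
with a much lower score.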
Please tell us if you have any suggestions or comments, or if you
experience any problems with it.
--
Stas Malyshev
smalyshev(a)wikimedia.org