Hi everyone!
I have a question about relevance search on Wikipedia articles, and Robert West from EPFL pointed me to this mailing list as the best place to ask it. I have been looking at the Elasticsearch query that the Wikipedia API performs when it runs a basic search on the articles. More precisely, I am talking about the following API call:
https://en.wikipedia.org/w/api.php?action=query&list=search&format=j...
The actual Elasticsearch query can be obtained by adding the cirrusDumpQuery parameter to the same call:
https://en.wikipedia.org/w/api.php?action=query&list=search&format=j...
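In case it helps anyone reproduce this, here is a minimal Python sketch of how the two URLs can be put together. It is only an illustration under assumptions: the helper name build_search_url is hypothetical, srsearch is the parameter I assume carries the keywords for list=search, and the remaining parameters of the truncated URLs above are not reproduced. It only builds the strings and makes no network request.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://en.wikipedia.org/w/api.php"


def build_search_url(keywords: str, dump_query: bool = False) -> str:
    """Build a search URL; optionally ask CirrusSearch to dump its query.

    Hypothetical helper: only action, list, srsearch (assumed) and
    cirrusDumpQuery are set. Other parameters of the original URLs
    are truncated in this message and deliberately left out.
    """
    params = {"action": "query", "list": "search", "srsearch": keywords}
    if dump_query:
        # The parameter is passed with an empty value, as a flag.
        params["cirrusDumpQuery"] = ""
    return f"{API_ENDPOINT}?{urlencode(params)}"


plain_url = build_search_url("architecture mathematics")
dump_url = build_search_url("architecture mathematics", dump_query=True)
print(plain_url)
print(dump_url)
```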
There are many things going on in that query, but my question concerns the rescoring step that produces the final score of each result. In particular, it concerns the clause
{ "sltr": { "model": "enwiki-20220421-20180215-query_explorer", "params": { "query_string": "architecture mathematics" } } }
I understand that the results are passed, together with the keywords, to a stored machine learning model named enwiki-20220421-20180215-query_explorer. As far as I understand, this is done using the LTR plugin for Elasticsearch (https://github.com/o19s/elasticsearch-learning-to-rank). My question is the following: is this model openly available anywhere? If so, could you point me to it? If not, do you know why it is not openly available even though Wikipedia uses it?
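To make clear which part of the dumped query I mean, here is a minimal Python sketch that walks a parsed query and collects every "sltr" clause with its model name. The function name find_sltr_clauses is hypothetical, and the dictionary below is a heavily trimmed stand-in: only the "sltr" clause is copied from the real dump, while the "rescore" / "rescore_query" wrapper around it is my assumption about the surrounding structure.

```python
from typing import Any, Iterator


def find_sltr_clauses(node: Any) -> Iterator[dict]:
    """Yield every dict stored under an "sltr" key, at any nesting depth."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "sltr" and isinstance(value, dict):
                yield value
            else:
                yield from find_sltr_clauses(value)
    elif isinstance(node, list):
        for item in node:
            yield from find_sltr_clauses(item)


# Trimmed stand-in for the dumped query. The wrapper keys are assumed;
# the "sltr" clause itself is the one quoted above.
dumped_query = {
    "rescore": [
        {
            "query": {
                "rescore_query": {
                    "sltr": {
                        "model": "enwiki-20220421-20180215-query_explorer",
                        "params": {"query_string": "architecture mathematics"},
                    }
                }
            }
        }
    ]
}

clauses = list(find_sltr_clauses(dumped_query))
models = [clause["model"] for clause in clauses]
print(models)
```

The search is recursive on purpose, so it does not depend on where exactly the clause sits in the real query.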
I posted this as part of a question on Stack Overflow a few days ago. Please see https://stackoverflow.com/questions/72213203/elasticsearch-query-for-wikiped... for more context and some related questions.
Thank you all in advance, and have a nice day!
Aitor Pérez
Machine Learning Engineer
EPFL Graph - CEDE - EPFL
aitor.perez@epfl.ch