Hi!
Oh, that was very kind of you, thanks a lot. The format is indeed self-explanatory, so it should not be a problem. However, looking at the feature set, I would like to confirm one thing: is each of these fields computed at query time, using the provided query_string, for each of the top 448 results after the first rescore? (In any case, this is what the docs of the LTR plugin for Elasticsearch suggest.)
If so, it would already be useful for me to have the actual Elasticsearch mapping of a Wikipedia page (the definitions of fields such as “title” or “opening_text” are more or less evident, but not so much those of “all_near_match” or “file_text.plain”). Is that available anywhere?
Thank you very much again and have a nice day!
Aitor
On 18 May 2022, at 23:17, ebernhardson@wikimedia.org wrote:
Hi!
These models have never been published, but not for any particular reason. I suppose no one had ever asked about them. I copied the current models out of Elasticsearch into https://people.wikimedia.org/~ebernhardson/cirrus_models.20220518/ in case looking them over might help you. They are in the format in which the sltr plugin stores them, which seems useful as it includes both the feature definitions and the XGBoost model in JSON.
Erik B.
_______________________________________________
Wiki-research-l mailing list -- wiki-research-l@lists.wikimedia.org
To unsubscribe send an email to wiki-research-l-leave@lists.wikimedia.org