Very nice write-up, Trey. My eyes didn't glaze over! :)
Product Manager, Discovery
On Tue, Jul 12, 2016 at 1:57 PM, Trey Jones <tjones(a)wikimedia.org> wrote:
Mikhail has written up and should soon release his report on our recent
TextCat A/B tests; the results look good, and language identification and
cross-wiki searching definitely improve the results (in terms of results
shown and results clicked) for otherwise poorly performing queries (those
that get fewer than 3 results).
Mikhail's report also suggests looking at some measure of confidence for
the language identification to see if that has any effect on the quality
(in terms of number of results, but more importantly clicks) of the
cross-wiki (also called "interwiki") results. This sounds like a good idea, but
TextCat doesn't make it super easy to do. I have some ideas, though, and I
would love some suggestions from anyone else who has any ideas.
The details are kind of technical, so if that kind of thing makes your
eyes glaze over, you should avert your gaze now.
Otherwise, check out my write-up on TextCat and confidence,
and share your ideas here or on the talk page.
Software Engineer, Discovery