Sounds interesting to me! Presumably their re-ranked results would be returned to users, and clicks and maybe other data would be logged for an A/B test.

The questions that immediately come to mind are:
- How much hand-holding would be needed to get their container set up? Given good specs on our side and decent tech capabilities on their side, the hand-holding could be very minimal.
- What do they get out of it? What kinds of data are they going to expect to collect, and if that data is covered by an NDA or by PII protections, does that limit their ability to publish related research?
- If it goes well, how do we integrate it? Are they going to be willing to make their core code open source?

I think these obstacles can all be readily overcome, but they are a few of the things we would have to think about as we enter such a collaboration.

The best outcome, though, would be great—new ideas and new techniques for us and better results for our users.

Trey Jones

Sr. Software Engineer, Search Platform
Wikimedia Foundation


On Fri, Oct 19, 2018 at 1:54 PM, Erik Bernhardson <ebernhardson@wikimedia.org> wrote:
While at the conference this week I met up with a professor and he had an interesting proposal. Essentially the idea would be they could build containers (following our OSS requirements) that we could deploy, and for some small percentage of search traffic we would pass our top 100 results to their container for reranking.

Certainly more details would have to be worked out, but does this seem reasonable? Overall I still think working directly with academics can be beneficial to our own work, although I'm not sure what our overhead would be.

_______________________________________________
Discovery mailing list
Discovery@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery