On 19 February 2016 at 17:13, Jon Katz <jkatz@wikimedia.org> wrote:
Thanks, Erik!  This is very helpful.  What do you mean by 'back testing'?  

For search, there are a few different approaches to quantitative testing that are less costly than A/B testing in terms of development overhead, data analysis, and coordination. One of those is to replay real user queries against the index, but run each query with slightly different parameters from the original. This is super cheap compared to an A/B test, but the downside is that it can only answer really deterministic (for lack of a better word) questions, like how the parameters affect the zero-results rate or result ordering; since there's no user interaction with the replayed queries, you don't know what the clickthrough would have been, so it's hard to measure how satisfied the user would have been.
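
To make that concrete, here's a rough Python sketch of the idea. The run_query function, the parameter names, and the query log are placeholders rather than our actual tooling, but the shape is the same: replay the same logged queries under two configurations and compare a deterministic metric like the zero-results rate.

# A minimal sketch of back testing, assuming a hypothetical run_query()
# that hits the search backend and returns a list of result titles.
# The query log, parameter names, and backend are all illustrative.

from typing import Callable, Iterable


def zero_results_rate(queries: Iterable[str],
                      run_query: Callable[[str, dict], list],
                      params: dict) -> float:
    """Replay each logged query with the given parameters and return
    the fraction of queries that came back with no results."""
    queries = list(queries)
    zero_hits = sum(1 for q in queries if not run_query(q, params))
    return zero_hits / len(queries) if queries else 0.0


def compare_configs(queries, run_query, baseline_params, candidate_params):
    """Run the same query log under two configurations and report the
    change in zero-results rate. Only deterministic metrics are possible
    here: replayed queries have no user, so no clickthrough data."""
    baseline = zero_results_rate(queries, run_query, baseline_params)
    candidate = zero_results_rate(queries, run_query, candidate_params)
    return {
        "baseline_zrr": baseline,
        "candidate_zrr": candidate,
        "delta": candidate - baseline,
    }


if __name__ == "__main__":
    # Toy stand-in for a real search backend, just to make the sketch runnable.
    def fake_run_query(query, params):
        return ["Some article"] if len(query) > params.get("min_length", 3) else []

    logged_queries = ["einstein", "cat", "theory of relativity", "ab"]
    report = compare_configs(
        logged_queries,
        fake_run_query,
        baseline_params={"min_length": 3},
        candidate_params={"min_length": 2},
    )
    print(report)

The same structure works for any metric you can compute purely from the returned results, which is exactly the limitation: satisfaction-style metrics still need a real A/B test.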

Hopefully that helps explain it.

Thanks,
Dan

--
Dan Garry
Lead Product Manager, Discovery
Wikimedia Foundation