We've had a lot of ideas floating around over the past week or two about what to do in the final weeks of the quarter towards tackling the zero results rate problem. This morning the engineering team had a 25-minute meeting to coalesce these ideas into a plan and sync up. We took notes in this etherpad: https://etherpad.wikimedia.org/p/nextupforsearch
In short, the outcome of the meeting is that we will run a test that relaxes the AND operator for common terms in queries. This should improve natural language queries by reducing how important words like "the", "a", etc. are to the query, thus focussing in on the essence of the query. It also means that pages which contain only the core terms, and not these common terms, could now be returned in results.
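To make the idea concrete, here is a rough toy sketch in Python of the difference between a strict AND and a relaxed one. It is only an illustration: the fixed COMMON_TERMS list, the sample query and page, and the function names are all made up for this example. A real implementation would more likely decide which terms count as "common" from how often they occur in the index rather than from a hard-coded list, and would also affect ranking, which this sketch ignores.

```python
# Toy illustration only: this word list, the sample data, and the function
# names are hypothetical and are not taken from the real search configuration.
COMMON_TERMS = {"the", "a", "an", "of", "is", "in", "to", "what"}

PUNCTUATION = ".,:;?!\"'"


def tokenize(text):
    """Lowercase the text and split it into words, dropping edge punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]


def matches_strict(query, page):
    """Strict AND: every query term must appear in the page."""
    page_terms = set(tokenize(page))
    return all(term in page_terms for term in tokenize(query))


def matches_relaxed(query, page):
    """Relaxed AND: only the non-common (core) terms are required."""
    page_terms = set(tokenize(page))
    query_terms = tokenize(query)
    core_terms = [t for t in query_terms if t not in COMMON_TERMS]
    # If the query is made up solely of common terms, require all of them,
    # otherwise every page would match.
    required = core_terms or query_terms
    return all(term in page_terms for term in required)


query = "what is the capital of France"
page = "Paris: capital city, France"

strict_result = matches_strict(query, page)    # page lacks "what", "is", "the", "of"
relaxed_result = matches_relaxed(query, page)  # page has both "capital" and "france"
```

With the strict AND the sample page is rejected because it lacks the common words, while the relaxed version accepts it because both core terms are present.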
This work is tracked in the following series of tasks, the structure of which should now be very familiar to you all:
- T112178 https://phabricator.wikimedia.org/T112178: Relax 'AND' operator with the common term query
- T112581 https://phabricator.wikimedia.org/T112581: Run A/B test on relaxing AND operator for search (test starting on 2015-09-22)
- T112582 https://phabricator.wikimedia.org/T112582: Validate data for AND operator A/B test (on or after 2015-09-23)
- T112583 https://phabricator.wikimedia.org/T112583: Analyse results of AND operator A/B test (on or after 2015-09-29)
One consequence is that we've probably got a bunch of tests lined up to start at the same time. In principle this isn't a problem, but overlapping tests can cause difficulties. This will be discussed in tomorrow's analysis meeting.
As always, if there are any questions, let me know!
Thanks,
Dan
wikimedia-search@lists.wikimedia.org