Cool! I guess I had it in my mind from a previous, incomplete conversation
that the goal was a specific cutoff. Thanks for the clarification.
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
On Fri, Jan 22, 2016 at 4:05 PM, Mikhail Popov <mpopov(a)wikimedia.org> wrote:
That's actually our goal with quick surveys :P We want to ask users for
their satisfaction with our search and then build a predictive model with
satisfaction as the response variable and dwell time + other data as the
predictor variables.
Right now we're stuck at the "get training data" step. Once that's
resolved, we can do precisely what you described :D Then we'll have a daily
estimate of user satisfaction (unobservable without direct user feedback)
using data we can observe (browsing behavior).
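
A minimal sketch of that kind of model, assuming a plain logistic regression with log dwell time as the only predictor. The survey responses below are made up for illustration, and `fit_logistic`, `predict_satisfaction`, and `daily_estimate` are hypothetical names, not existing Discovery code. A real model would include the other predictors and might need a more flexible form, since satisfaction need not rise steadily with dwell time.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit P(satisfied) = sigmoid(b0 + b1 * x) by batch gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(b0 + b1 * x) - y  # gradient of the log-loss
            g0 += err
            g1 += err * x
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1

# Hypothetical survey data: dwell time in seconds, 1 = user said "satisfied".
dwell_seconds = [2, 4, 8, 12, 30, 45, 90, 200]
satisfied = [0, 0, 0, 1, 0, 1, 1, 1]

# Log-transform dwell time so that very long visits do not dominate the fit.
b0, b1 = fit_logistic([math.log(t) for t in dwell_seconds], satisfied)

def predict_satisfaction(dwell):
    """Predicted probability that a visit with this dwell time was satisfying."""
    return sigmoid(b0 + b1 * math.log(dwell))

def daily_estimate(observed_dwell_times):
    """Mean predicted satisfaction over one day's observed dwell times."""
    probs = [predict_satisfaction(t) for t in observed_dwell_times]
    return sum(probs) / len(probs)
```

Once fitted on survey responses, `daily_estimate` only needs the observed dwell times, which is what would make a daily satisfaction figure possible without asking users every day.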
Thanks,
Mikhail
On Fri, Jan 22, 2016 at 11:19 AM, Trey Jones <tjones(a)wikimedia.org> wrote:
Yesterday in the quarterly review Dan mentioned that our current user
satisfaction metric uses the somewhat arbitrary 10s dwell time cutoff for a
successful search, and that we want to use a survey to correlate
qualitative and quantitative values to pin down a better cutoff for our
users. I don't remember whether Dan mentioned it, or I was just rehashing
the notion on my own, but it may be difficult to pin down a specific cutoff.
A wild thought appears! Why do we have to pin down a specific cutoff?
Why can't we have a probabilistic user satisfaction metric? (Other than
complexity and computational speed, which may be relevant.)
We have the ability to gather so much data that we could easily compute
something like this: 20% of users are satisfied when dwell time is <5s, 35%
for 5-10s, 75% for 10-60s, 98% for 1m-5m, 85% for 5m-20m, and 80% for >20m.
Determining the cutoffs might be tricky, and computation is more complex
than counting, but not ridiculously complicated, and potentially much more
accurate for large samples. Presenting the results is still easy: "54.7% of
our users are happy with their search results based on our dwell-time
model".
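
A rough sketch of that computation, assuming the illustrative per-bin percentages above (they are made-up examples, not measured values). The names `BIN_EDGES`, `P_SATISFIED`, `satisfaction_probability`, and `estimated_satisfaction` are hypothetical, and the choice to put a dwell time that falls exactly on a boundary into the longer bin is an assumption.

```python
from bisect import bisect_right

# Upper edges of the dwell-time bins, in seconds:
# <5s, 5-10s, 10-60s, 1m-5m, 5m-20m, >20m
BIN_EDGES = [5, 10, 60, 300, 1200]

# Illustrative share of satisfied users in each bin (example figures only).
P_SATISFIED = [0.20, 0.35, 0.75, 0.98, 0.85, 0.80]

def satisfaction_probability(dwell_seconds):
    """Look up the satisfaction probability for one dwell time."""
    return P_SATISFIED[bisect_right(BIN_EDGES, dwell_seconds)]

def estimated_satisfaction(dwell_times):
    """Average the per-visit probabilities instead of counting visits past a cutoff."""
    if not dwell_times:
        raise ValueError("need at least one dwell time")
    total = sum(satisfaction_probability(t) for t in dwell_times)
    return total / len(dwell_times)
```

For example, `estimated_satisfaction([3, 30])` averages 0.20 and 0.75, whereas the 10s cutoff would count the first visit as a failure and the second as a success.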
I tried to do a quick search for papers on this topic, but I didn't find
anything. I'm not familiar with the literature, so that may not mean much.
Okay, back to the TextCat mines....
—Trey
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
_______________________________________________
discovery mailing list
discovery(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/discovery
--
*Mikhail Popov* // Data Analyst, Discovery
<https://www.mediawiki.org/wiki/Wikimedia_Discovery>
https://wikimediafoundation.org/
*Imagine a world in which every single human being can freely share in the
sum of all knowledge. That's our commitment.* Donate
<https://donate.wikimedia.org/>.