Yesterday in the quarterly review, Dan mentioned that our current user satisfaction metric uses a somewhat arbitrary 10s dwell-time cutoff for a successful search, and that we want to use a survey to correlate qualitative and quantitative values to pin down a better cutoff for our users. I don't remember whether Dan mentioned it or I was just rehashing the notion on my own, but it may be difficult to pin down a specific cutoff.
A wild thought appears! Why do we have to pin down a specific cutoff? Why can't we have a probabilistic user satisfaction metric? (Other than complexity and computational speed, which may be relevant.)
We have the ability to gather so much data that we could easily compute something like this: 20% of users are satisfied when dwell time is <5s, 35% for 5-10s, 75% for 10-60s, 98% for 1m-5m, 85% for 5m-20m, and 80% for >20m.
Determining the cutoffs might be tricky, and the computation is more complex than simple counting, but it's not ridiculously complicated, and it's potentially much more accurate for large samples. Presenting the results is still easy: "54.7% of our users are happy with their search results, based on our dwell-time model".
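Here's a rough sketch in Python of the kind of computation I mean. The per-bucket rates are my made-up numbers from above, and the sample dwell times are invented purely for illustration:

# Rough sketch of the probabilistic metric: expected satisfaction from
# bucketed dwell times. Bucket rates are the illustrative numbers above;
# everything else here is invented for the example.

# (lower bound in seconds, upper bound in seconds or None for open-ended,
#  P(satisfied | dwell time in bucket))
BUCKETS = [
    (0, 5, 0.20),
    (5, 10, 0.35),
    (10, 60, 0.75),
    (60, 300, 0.98),
    (300, 1200, 0.85),
    (1200, None, 0.80),
]

def estimated_satisfaction(dwell_times):
    """Expected fraction of satisfied users, given observed dwell times."""
    total = 0.0
    for t in dwell_times:
        for lo, hi, p in BUCKETS:
            if lo <= t and (hi is None or t < hi):
                total += p
                break
    return total / len(dwell_times)

# Hypothetical sample of dwell times, in seconds:
sample = [3, 7, 15, 45, 120, 400, 900, 1500, 65, 8]
print("%.1f%% of users estimated satisfied" % (100 * estimated_satisfaction(sample)))

The nice property is that no single session is forced into a binary success/failure call; each one just contributes its bucket's probability to the aggregate.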
I tried to do a quick search for papers on this topic, but I didn't find anything. I'm not familiar with the literature, so that may not mean much.
Okay, back to the TextCat mines....
—Trey
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
That's actually our goal with quick surveys :P We want to ask users how satisfied they are with our search, and then build a predictive model with satisfaction as the response variable and dwell time plus other data as the predictor variables.
Right now we're stuck at the "get training data" step. Once that's resolved, we can do precisely what you described :D Then we'll have a daily estimate of user satisfaction (unobservable without direct user feedback) using data we can observe (browsing behavior).
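To sketch what that modeling step could look like once we have survey responses (the file and column names below are hypothetical, and logistic regression is just one candidate model):

# Sketch of the modeling step: survey response as the response variable,
# dwell time and other behavioral signals as predictors. The file and
# column names here are hypothetical placeholders.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("survey_sessions.csv")  # placeholder: one row per surveyed session
X = df[["dwell_time", "results_clicked", "query_reformulations"]]
y = df["satisfied"]  # 1 = user reported satisfaction in the survey, 0 = not

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# Once trained, the model scores *unsurveyed* sessions, giving a daily
# satisfaction estimate from observable browsing behavior alone, e.g.:
# daily_estimate = model.predict_proba(todays_sessions)[:, 1].mean()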
Thanks,
Mikhail
Cool! I guess I had it in my mind from a previous, incomplete conversation that the goal was a specific cutoff. Thanks for the clarification.
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
Do we have a timeline on when we want to do those quick surveys?
Cheers,
Deb
That was my next question! :-)
Adam from Reading asked me to put together our list of blockers so that Reading is aware of them. I wasn't really driving the survey forwards, so I'm not aware of what the technical blockers were. Could someone (Julien?) point to the relevant tasks/patches?
Thanks,
Dan
The top-level task that was being used is: https://phabricator.wikimedia.org/T118800 ("Add a survey on the article page when coming from a wiki search")
It lists three things that, at a glance, appear to be technical blockers, but Julien and others should be able to clarify.
Kevin Smith
Agile Coach, Wikimedia Foundation