Yesterday in the quarterly review, Dan mentioned that our current user satisfaction metric uses a somewhat arbitrary 10s dwell-time cutoff for a successful search, and that we want to use a survey to correlate qualitative and quantitative values to pin down a better cutoff for our users. I don't remember whether Dan mentioned it or I was just rehashing the notion on my own, but it may be difficult to pin down a specific cutoff.
A wild thought appears! Why do we have to pin down a specific cutoff? Why can't we have a probabilistic user satisfaction metric? (Other than complexity and computational speed, which may be relevant.)
We have the ability to gather so much data that we could easily compute something like this: 20% of users are satisfied when dwell time is <5s, 35% for 5-10s, 75% for 10-60s, 98% for 1m-5m, 85% for 5m-20m, and 80% for >20m.
Determining the cutoffs might be tricky, and the computation is more complex than simple counting, but it's not ridiculously complicated, and it's potentially much more accurate for large samples. Presenting the results is still easy: "54.7% of our users are happy with their search results, based on our dwell-time model".
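Here's a rough sketch in Python of the kind of computation I mean. The per-bucket rates are my made-up numbers from above, and the sample dwell times are invented purely for illustration:

# Rough sketch of the probabilistic metric: expected satisfaction from
# bucketed dwell times. Bucket rates are the illustrative numbers above;
# everything else here is invented for the example.

# (lower bound in seconds, upper bound in seconds or None for open-ended,
#  P(satisfied | dwell time in bucket))
BUCKETS = [
    (0, 5, 0.20),
    (5, 10, 0.35),
    (10, 60, 0.75),
    (60, 300, 0.98),
    (300, 1200, 0.85),
    (1200, None, 0.80),
]

def estimated_satisfaction(dwell_times):
    """Expected fraction of satisfied users, given observed dwell times."""
    total = 0.0
    for t in dwell_times:
        for lo, hi, p in BUCKETS:
            if lo <= t and (hi is None or t < hi):
                total += p
                break
    return total / len(dwell_times)

# Hypothetical sample of dwell times, in seconds:
sample = [3, 7, 15, 45, 120, 400, 900, 1500, 65, 8]
print("%.1f%% of users estimated satisfied" % (100 * estimated_satisfaction(sample)))

The nice property is that no single session is forced into a binary success/failure call; each one just contributes its bucket's probability to the aggregate.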
I tried to do a quick search for papers on this topic, but I didn't find anything. I'm not familiar with the literature, so that may not mean much.
Okay, back to the TextCat mines....
—Trey
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
That's actually our goal with quick surveys :P We want to ask users how satisfied they are with our search, and then build a predictive model with satisfaction as the response variable and dwell time plus other data as the predictor variables.
Right now we're stuck at the "get training data" step. Once that's resolved, we can do precisely what you described :D Then we'll have a daily estimate of user satisfaction (unobservable without direct user feedback) using data we can observe (browsing behavior).
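To sketch what that modeling step could look like once we have survey responses (the file and column names below are hypothetical, and logistic regression is just one candidate model):

# Sketch of the modeling step: survey response as the response variable,
# dwell time and other behavioral signals as predictors. The file and
# column names here are hypothetical placeholders.

import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

df = pd.read_csv("survey_sessions.csv")  # placeholder: one row per surveyed session
X = df[["dwell_time", "results_clicked", "query_reformulations"]]
y = df["satisfied"]  # 1 = user reported satisfaction in the survey, 0 = not

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = LogisticRegression().fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))

# Once trained, the model scores *unsurveyed* sessions, giving a daily
# satisfaction estimate from observable browsing behavior alone, e.g.:
# daily_estimate = model.predict_proba(todays_sessions)[:, 1].mean()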
Thanks,
Mikhail
Cool! I guess I had it in my mind from a previous, incomplete conversation that the goal was a specific cutoff. Thanks for the clarification.
Trey Jones
Software Engineer, Discovery
Wikimedia Foundation
Do we have a timeline on when we want to do those quick surveys?
Cheers,
Deb
That was my next question! :-)
Adam from Reading asked me to put together our list of blockers so that Reading is aware of them. I wasn't really driving the survey forwards, so I'm not aware of what the technical blockers were. Could someone (Julien?) point to the relevant tasks/patches?
Thanks,
Dan
The top-level task that was being used is: https://phabricator.wikimedia.org/T118800 ("Add a survey on the article page when coming from a wiki search")
It lists three things that, at a glance, appear to be technical blockers, but Julien and others should be able to clarify.
Kevin Smith
Agile Coach, Wikimedia Foundation