Greetings,

Moving the discussion from IRC to email for added transparency and visibility...

Previously on IRC:

tfinc 

13:45 Deskana: so much really interesting talk about search on https://meta.wikimedia.org/wiki/Talk:2016_Strategy/Reach#NaBUru38

13:46 https://meta.wikimedia.org/wiki/Talk:2016_Strategy/Reach

13:46 less about that specific post and more about the conversations in general

13:46 i see a lot of people who could help us test and move with next steps


JustinO

13:49 that talk is actually what reminded me to check in with you folks and see if you wanted assistance in the relevance area


tfinc

13:51 JustinO: greetings. we can always use wise guidance and help to make our users and donors proud. what do you have in mind ?


JustinO

13:52 last year I was talking with a couple of folks after elasticon

13:53 and we were going thru the first steps like which metrics are useful to track


jgirault

13:54 debt: OuKB: jan_drewniak: besides a varnish issue with images, the page with separate JS file is on beta http://www.wikipedia.beta.wmflabs.org/


tfinc

13:55 JustinO: ebernhardson and i will be at this year's elasticon

13:56 JustinO: we've been looking at a number of interesting metrics to validate user satisfaction for our search relevance. bearloga can tell you plenty about it


JustinO

13:57 awesome. i looked thru some of your docs. tracking dwell time is great as it opens up a whole host of useful metrics


ebernhardson

13:57 JustinO: we almost certainly need help in relevance :) we are currently hitting some very high level things, but we need to do a lot more in terms of collecting and measuring relevance (both from users, and in back testing for new features) to do well moving forward


bearloga

13:58 JustinO: we're tracking dwell time and clickthrough rate. we hope to get some qualitative user feedback to correlate that with the quantitative data we're tracking


JustinO

13:58 with that you can infer good clicks vs. bad clicks. which leads to a session success rate, time to success, etc. and in the long run gives you a training set to do offline evaluations and, eventually, machine learned rankers
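
A minimal sketch of the idea above, assuming a click counts as "good" when dwell time passes a threshold. The 10-second threshold, the `Click` record, and its field names are hypothetical and only for illustration; they are not the team's actual schema. Sessions with no clicks at all would need to be counted separately.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Assumed cut-off: the chat does not name a dwell-time threshold.
GOOD_CLICK_DWELL_SECONDS = 10.0


@dataclass
class Click:
    session_id: str              # anonymous session token (hypothetical field)
    seconds_since_search: float  # time from results page load to the click
    dwell_seconds: float         # time spent on the clicked page


def is_good_click(click: Click, threshold: float = GOOD_CLICK_DWELL_SECONDS) -> bool:
    """Treat a click as 'good' if the user stayed at least `threshold` seconds."""
    return click.dwell_seconds >= threshold


def session_metrics(clicks: List[Click]) -> Dict[str, Optional[float]]:
    """Compute session success rate and mean time to first good click."""
    by_session: Dict[str, List[Click]] = {}
    for click in clicks:
        by_session.setdefault(click.session_id, []).append(click)

    times_to_success: List[float] = []
    for session_clicks in by_session.values():
        good_times = [c.seconds_since_search for c in session_clicks if is_good_click(c)]
        if good_times:
            # A session succeeds at its earliest good click.
            times_to_success.append(min(good_times))

    total = len(by_session)
    return {
        "session_success_rate": len(times_to_success) / total if total else 0.0,
        "mean_time_to_success": (
            sum(times_to_success) / len(times_to_success) if times_to_success else None
        ),
    }
```

The good/bad labels produced this way could also serve as the noisy training set mentioned above.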


jgirault

13:59 the deploy-to-prod patch would be: https://gerrit.wikimedia.org/r/268804


tfinc

13:59 JustinO: Trey314159 has worked a bit on creating a base line relevance lab to do offline evaluations between different ranking/sorting/etc algorithms


JustinO

14:00 @bearloga: one simple way of qualitative feedback is the simple "how was your search today?" message


jan_drewniak

14:01 jgirault: like someone once said, the hardest things in programming are cache invalidation and naming things 


JustinO

14:01 @tfinc: offline evals are very useful. creating a hand generated judgment set with clean labels takes time but pays off
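
As a rough illustration of an offline evaluation against a hand-labeled judgment set, here is a sketch that compares two rankers by mean precision@k. The metric choice, the function names, and the data shapes (query mapped to a set of relevant page ids) are assumptions for illustration, not what the relevance lab actually does.

```python
from typing import Dict, List, Set


def precision_at_k(ranked_page_ids: List[int], relevant_page_ids: Set[int], k: int) -> float:
    """Fraction of the top-k results that the judgment set marks as relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    top_k = ranked_page_ids[:k]
    hits = sum(1 for page_id in top_k if page_id in relevant_page_ids)
    return hits / k


def mean_precision_at_k(
    judgments: Dict[str, Set[int]],   # query -> hand-labeled relevant page ids
    rankings: Dict[str, List[int]],   # query -> ranked page ids from one ranker
    k: int = 5,
) -> float:
    """Average precision@k over every judged query (missing rankings score 0)."""
    if not judgments:
        return 0.0
    scores = [
        precision_at_k(rankings.get(query, []), relevant, k)
        for query, relevant in judgments.items()
    ]
    return sum(scores) / len(scores)
```

Running `mean_precision_at_k` once per candidate ranker over the same judgments gives a like-for-like comparison without touching live traffic.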


ebernhardson

14:01 we also do track which position the user clicked, in addition to dwell time. But i don't think we are doing anything with that information yet


bearloga

14:02 JustinO: the question we're going to ask is basically that but we're working on rolling out that feedback system


jgirault

14:02 jan_drewniak: and choosing between spaces and tabs


JustinO

14:04 ebernhardson: i think i was suggesting tracking {query, all results, position clicked, dwell time on the clicked page, userid, time from pageload to click}
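
For concreteness, the suggested tuple could be sketched as a single event record like the one below. All field names are hypothetical, representing "all results" as a list of page ids is an assumption, and an anonymous token stands in for the userid. This is not the actual logging schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class SearchClickEvent:
    query: str
    result_page_ids: List[int]  # all results shown, in rank order (assumed to be page ids)
    position_clicked: int       # 0-based index into result_page_ids
    dwell_seconds: float        # dwell time on the clicked page
    user_token: str             # anonymous stand-in for a userid
    seconds_to_click: float     # time from pageload to click

    def clicked_page_id(self) -> int:
        """Return the page id at the clicked position."""
        return self.result_page_ids[self.position_clicked]


def to_log_line(event: SearchClickEvent) -> str:
    """Serialize one event as a JSON line for later offline analysis."""
    return json.dumps(asdict(event), sort_keys=True)
```

Keeping the full result list alongside the clicked position is what lets you later ask which results were skipped, not just which one was chosen.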


jgirault

14:04 alright, so I'm gonna head to the office now. Once I get there, I'll try to find someone to push that to prod. Meanwhile, if you have time jan_drewniak you can sanity check the latest master


Trey314159

14:04 JustinO: Hey! Sorry Dan (Deskana) and I haven't gotten back to your email yet. It's been a busy week, and there's a lot of stuff but not a lot of context to that email thread.


JustinO

14:04 @Trey314159: no worries


Trey314159

14:04 Fortunately, James outlined your conversation: https://meta.wikimedia.org/wiki/Schema_talk:Search#Useful_metrics_to_track

14:05 (For anyone else who wants to take a look)


ebernhardson

14:05 JustinO: interesting, i think we are collecting most of those, but not the all results or the user id. We do collect a token that is a short-term proxy for the user id though


JustinO

14:05 an anonymous token for the id is great


ebernhardson

14:05 JustinO: i'm curious, by all results you mean (in our case) a list of page titles or id's?


Ironholds

14:05 JustinO, can I ask you to move this to the mailing list or email myself or bearloga? We can explain what we're already tracking, what we're planning on tracking, and you can chip in feedback


ebernhardson

14:05 i hadn't thought of that, but it makes sense


JustinO

14:05 @ebernhardson : pageids i suppose, i'm not sure what's best for wikimedia


Ironholds

14:06 at the moment this is kind of duplicative because you don't know what we're tracking in advance of suggesting we track it ;p


ebernhardson

14:06 the current schema is here: https://meta.wikimedia.org/wiki/Schema:TestSearchSatisfaction2

14:06 the descriptions could be better, but give a general idea 


JustinO

14:07 ebernhardson: session id is prob fine for a userid unless you want to get towards personalization in the long run. eg: give coders more pages related to tech


Trey314159

14:07 Ironholds: to be fair, JustinO suggested we track it long before we actually did (early last year).. but I agree this might be a better conversation on the mailing list, definitely including Ironholds and bearloga, and not late on a Friday afternoon (local time for me, at least)


Ironholds

14:07 JustinO, yep, we've tested session IDs. We know these things ;p


Ironholds

14:08 let's chat on the mailing lists where conversations can be seen by other users/helpers for transparency purposes, and we can be async to avoid time drains


JustinO

14:08 yeah, i'm assuming you've put lots of thought into the topics


Ironholds

14:09 https://lists.wikimedia.org/mailman/listinfo/discovery for reference


JustinO

14:09 yep


Ironholds

14:10 (our mailing list infrastructure makes it a nightmare to find anything. I just use google ;p)


JustinO

14:10 i may be on there


Ironholds

14:10 (...appropriate for the discovery team I guess)


bearloga

14:10 chuckles


ebernhardson

14:10 Ironholds: while i don't expect it will make it into prod (change is hard) there is a test instance of discourse that could plausibly replace mailing lists and be more discoverable

14:11 https://discourse.wmflabs.org/


Ironholds

14:11 cool!


--justin