Thanks, Erik!
For those who want to jump to *just* the right spot:
- Aaron starts his presentation at 28m38s: https://www.youtube.com/watch?v=rsFmqYxtt9w&t=28m38s
- He talks about bad signals as part 2 of 3 at 42m02s, covering Italian "ha" and anonymous users: https://www.youtube.com/watch?v=rsFmqYxtt9w&t=42m02s
- He talks about anon specifically at 47m15s: https://www.youtube.com/watch?v=rsFmqYxtt9w&t=47m15s (that section ends at about 55m, so it's only 8 minutes of video to watch, or ~4 minutes at 2x)
Some key things are the differences in robustness between the two types of models, and the testing by holding one feature constant to assess its global effect. Neat stuff.
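For anyone who wants to try the hold-one-feature-constant test, here is a minimal Python sketch of the idea as I understand it. Everything in it is made up for illustration: `toy_score` is a hypothetical stand-in for a trained model, and the feature names `is_anon` and `chars_added_log` and their weights are assumptions, not ORES's actual features or code.

```python
import math


def toy_score(row):
    """Hypothetical stand-in for a trained model: a score in (0, 1) for one row."""
    z = -1.0 + 2.0 * row["is_anon"] + 0.5 * row["chars_added_log"]
    return 1.0 / (1.0 + math.exp(-z))


def mean_score_with_feature_fixed(score, rows, feature, value):
    """Score every row with `feature` forced to `value` and return the mean score.

    Each row is copied first, so the original data is left untouched.
    """
    forced = [dict(row, **{feature: value}) for row in rows]
    return sum(score(r) for r in forced) / len(forced)


def global_effect(score, rows, feature, low, high):
    """Shift in the mean score when `feature` is forced from `low` to `high`."""
    return (mean_score_with_feature_fixed(score, rows, feature, high)
            - mean_score_with_feature_fixed(score, rows, feature, low))


# Made-up sample rows; real use would take these from a held-out test set.
rows = [
    {"is_anon": 0, "chars_added_log": 1.0},
    {"is_anon": 1, "chars_added_log": 2.0},
    {"is_anon": 0, "chars_added_log": 3.0},
]

effect = global_effect(toy_score, rows, "is_anon", 0, 1)
print(f"mean score shift when is_anon is forced from 0 to 1: {effect:.3f}")
```

A large shift suggests the model leans heavily on that one feature, which is the kind of bias worth checking for in our own features.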
—Trey
Trey Jones
Sr. Software Engineer, Search Platform
Wikimedia Foundation
On Wed, Jan 31, 2018 at 2:23 PM, Erik Bernhardson <ebernhardson@wikimedia.org> wrote:
We talked about this briefly in our meeting today. I got the links from Aaron for the original slides[1] and talk[2] (starts about halfway through). I don't think we are using any features with as strong a bias as the logged-in bit for ORES, but it's still something to think about.
[1] https://www.mediawiki.org/wiki/File:Deploying_and_maintaining_AI_in_a_socio-technical_system_--_Research_Showcase_(August_2016).pdf
[2] https://www.youtube.com/watch?v=rsFmqYxtt9w