Re: [Wikimedia-l] [Wikimedia Research Showcase] May 20, 2020: Human in the Loop Machine Learning

18 May 2020

Just a reminder that this is happening on Wednesday. There will be a Q&A,
which might be especially nice if you're in an area that's still socially
distancing. ;)

On Fri, May 15, 2020 at 1:04 PM Janna Layton <jlayton(a)wikimedia.org> wrote:

  Hi all,

 The next Research Showcase will be live-streamed on Wednesday, May 20, at
 9:30 AM PDT/16:30 UTC.

 This month we will learn about recent research on machine learning systems
 that rely on human supervision for their learning and optimization -- a
 research area commonly referred to as Human-in-the-Loop ML. In the first
 talk, Jie Yang will present a computational framework that relies on
 crowdsourcing to identify influencers in Social Networks (Twitter) by
 selectively obtaining labeled data. In the second talk, Estelle Smith will
 discuss the role of the community in maintaining ORES, the machine learning
 system that predicts the quality of edits for Wikipedia applications.

 YouTube stream: https://www.youtube.com/watch?v=8nDiu2ebdOI

 As usual, you can join the conversation on IRC at #wikimedia-research. You
 can also watch our past research showcases here:
 https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase

 This month's presentations:

 *OpenCrowd: A Human-AI Collaborative Approach for Finding Social
 Influencers via Open-Ended Answers Aggregation*

 By: Jie Yang, Amazon (current), Delft University of Technology (starting
 soon)

 Finding social influencers is a fundamental task in many online
 applications ranging from brand marketing to opinion mining. Existing
 methods heavily rely on the availability of expert labels, whose collection
 is usually a laborious process even for domain experts. Using open-ended
 questions, crowdsourcing provides a cost-effective way to find a large
 number of social influencers in a short time. Individual crowd workers,
 however, only possess fragmented knowledge that is often of low quality. To
 tackle those issues, we present OpenCrowd, a unified Bayesian framework
 that seamlessly incorporates machine learning and crowdsourcing for
 effectively finding social influencers. To infer a set of influencers,
 OpenCrowd bootstraps the learning process using a small number of expert
 labels and then jointly learns a feature-based answer quality model and the
 reliability of the workers. Model parameters and worker reliability are
 updated iteratively, allowing their learning processes to benefit from each
 other until an agreement on the quality of the answers is reached. We
 derive a principled optimization algorithm based on variational inference
 with efficient updating rules for learning OpenCrowd parameters.
 Experimental results on finding social influencers in different domains
 show that our approach substantially improves the state of the art by 11.5%
 AUC. Moreover, we empirically show that our approach is particularly useful
 in finding micro-influencers, who are very directly engaged with smaller
 audiences.

 Paper: https://dl.acm.org/doi/fullHtml/10.1145/3366423.3380254

 *Keeping Community in the Machine-Learning Loop*

 By:  C. Estelle Smith, MS, PhD Candidate, GroupLens Research Lab at the
 University of Minnesota

 On Wikipedia, sophisticated algorithmic tools are used to assess the
 quality of edits and take corrective actions. However, algorithms can fail
 to solve the problems they were designed for if they conflict with the
 values of communities who use them. In this study, we take a
 Value-Sensitive Algorithm Design approach to understanding a
 community-created and -maintained machine learning-based algorithm called
 the Objective Revision Evaluation System (ORES)—a quality prediction system
 used in numerous Wikipedia applications and contexts. Five major values
 converged across stakeholder groups that ORES (and its dependent
 applications) should: (1) reduce the effort of community maintenance, (2)
 maintain human judgement as the final authority, (3) support differing
 peoples’ differing workflows, (4) encourage positive engagement with
 diverse editor groups, and (5) establish trustworthiness of people and
 algorithms within the community. We reveal tensions between these values
 and discuss implications for future research to improve algorithms like
 ORES.

 Paper: https://commons.wikimedia.org/wiki/File:Keeping_Community_in_the_Loop-_Unde…

 --
 Janna Layton (she, her)
 Administrative Assistant - Product & Technology
 Wikimedia Foundation <https://wikimediafoundation.org/>

