( https://meta.wikimedia.org/wiki/User:Pine )
---------- Forwarded message ---------
From: Sarah R <srodlund(a)wikimedia.org>
Date: Fri, Aug 10, 2018 at 10:46 PM
Subject: [Analytics] Wikimedia Research Showcase August 13 2018 at 11:30 AM
(PDT) 18:30 UTC
To: <wikimedia-l(a)lists.wikimedia.org>, <wiki-research-l(a)lists.wikimedia.org>,
The next Wikimedia Research Showcase will be live-streamed Monday,
August 13, 2018 at 11:30 AM PDT (18:30 UTC).
YouTube stream: https://www.youtube.com/watch?v=OGPMS4YGDMk
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here.
Hope to see you there!
This month's presentation is:
*Quicksilver: Training an ML system to generate draft Wikipedia articles
and Wikidata entries simultaneously*
John Bohannon and Vedant Dharnidharka, Primer
The automatic generation and updating of Wikipedia articles is usually
approached as a multi-document summarization task: Given a set of source
documents containing information about an entity, summarize the entity.
Purely sequence-to-sequence neural models can pull that off, but getting
enough data to train them is a challenge. Wikipedia articles and their
reference documents can be used for training, as was recently done
<https://arxiv.org/abs/1801.10198> by a team at Google AI. But how do you
find new source documents for new entities? And besides having humans read
all of the source documents, how do you fact-check the output? What is
needed is a self-updating knowledge base that learns jointly with a
summarization model, keeping track of data provenance. Lucky for us, the
world’s most comprehensive public encyclopedia is tightly coupled with
Wikidata, the world’s most comprehensive public knowledge base. We have
built a system called Quicksilver that uses them both.