Hello all,
This month's Wikimedia Research Showcase will be held on Wednesday, June
23, at 16:30 UTC (9:30am PDT). The theme is "AI model governance" with
presentations from Haiyi Zhu of Carnegie Mellon University and Andy Craze
and Chris Albon of the Wikimedia Foundation's Machine Learning Team.
YouTube: https://www.youtube.com/watch?v=USSBuwebWt4
MediaWiki:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#June_2021
Talk 1
Speaker: Haiyi Zhu (Carnegie Mellon University)
Title: Bridging AI and HCI: Incorporating Human Values into the Development
of AI Technologies
Abstract: The increasing accuracy and falling costs of AI have stimulated
the increased use of AI technologies in mainstream user-facing applications
and services. However, there is a disconnect between mathematically
rigorous AI approaches and the human stakeholders’ needs, motivations, and
values, as well as organizational and institutional realities, contexts,
and constraints; this disconnect is likely to undermine practical
initiatives and may sometimes lead to negative societal impacts. In this
presentation, I will discuss my research on incorporating human
stakeholders’ values and feedback into the creation process of AI
technologies. I will describe a series of projects in the context of the
Wikipedia community to illustrate my approach. I hope this presentation
will contribute to the rich ongoing conversation concerning bridging HCI
and AI and using HCI methods to address AI challenges.
Talk 2
Speaker: Andy Craze, Chris Albon (Wikimedia Foundation, Machine Learning
Team)
Title: ML Governance: First Steps
Abstract: The WMF Machine Learning team is upgrading the Foundation's
infrastructure to support the modern machine learning ecosystem. As part of
this work, the team seeks to understand its ethical and legal
responsibilities for developing and hosting predictive models within a
global context. Drawing from previous WMF research related to ethical and
human-centered machine learning, the team wishes to begin a series of
conversations to discuss how we can deploy responsible systems that are
inclusive of newcomers and non-experts, while upholding our commitment to
free and open knowledge.
--
Janna Layton (she/her)
Administrative Associate - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
Good morning,
I have recently received approval and can now begin recruiting participants
for a study that seeks to explore the information behaviour of people who
access Wikipedia’s health and medical content. This study applies theories
and models of information behaviour from the discipline of library and
information science.
I kindly ask that you share this study widely within your personal and
professional networks as I require 25-30 participants. Participants will be
asked to complete a short online survey and participate in an interview
over Zoom. Participants who complete both the survey and the interview will
receive a $25 CAD gift card to a coffee shop of their choice.
I have attached the recruitment poster here. Details about the study can be
found on the study’s project page
<https://meta.wikimedia.org/wiki/Research:Wikipedia_and_consumer_health_info…>.
Those interested in participating can email wikistudy3(a)gmail.com.
Thanks very much,
Denise
The Social Data Science program at the University of Oxford is hiring!
Below is the link for an open position in the MSc program in Social Data
Science. This position is suitable for someone at an early- or mid-career
stage. It is a three-year fixed-term position in the first instance and is
funded directly by the Oxford Internet Institute.
Our SDS program is expanding and finding ever new ways to consider the
relationships between the computational and social sciences while tackling
problems of social and political importance, particularly with respect to
social inequality and internet governance. This position will be an
exciting and hopefully rewarding opportunity to expand the scope of our
interdisciplinary program and mentor exceptional students while building
your own research profile.
Post link:
https://my.corehr.com/pls/uoxrecruit/erq_jobspec_details_form.jobspec?p_id=…
The post closes July 15th, with shortlisting to follow in the week
thereafter. Please feel free to circulate this post among your networks.
Best wishes,
Scott
--
Dr Scott A. Hale
Associate Professor and Senior Research Fellow,
Co-Director of the International Multimodal Communication Centre
Oxford Internet Institute, University of Oxford
SCR Member, St Antony's College, Oxford
Turing Fellow, Alan Turing Institute
http://scott.hale.us/
scott.hale(a)oii.ox.ac.uk
The Public Service Media and Public Service Internet Manifesto
The Internet and the media landscape are broken. The dominant commercial
Internet platforms endanger democracy. Despite all the great
opportunities the Internet has offered to society and individuals, the
digital giants have acquired unparalleled economic, political and
cultural power. As currently organised, the Internet separates and
divides instead of creating common spaces for negotiating difference and
disagreement. Democracy requires Public Service Media and a Public
Service Internet.
The Public Service Media and Public Service Internet Manifesto:
http://bit.ly/psmmanifesto
Please sign the Manifesto:
http://bit.ly/signPSManifesto
Occupy the Internet: 200 Media Experts Publish an Alarming Wake Up Call
and Demand a Public Service Internet
https://pressat.co.uk/releases/occupy-the-internet-200-media-experts-publis…
Launch event (video):
https://www.youtube.com/watch?v=i0kiilUrF9o
Hello everybody,
I'd like to announce a new dataset of Wikipedia pageviews that shows trends
in search engine usage across countries, language editions, internet
browsers, and device operating systems (
https://techblog.wikimedia.org/2021/06/07/searching-for-wikipedia/). We are
releasing this dataset within the context of the Foundational program [1]
and in particular research focused on Wikimedia's relationship with
external platforms [2]. This dataset fills a large gap in both public
information about global search engine usage and our understanding of the
external platforms that mediate how readers reach Wikipedia.
For context, approximately 50% of Wikipedia pageviews come directly from
clicking on links on search engines (with another 30% coming from clicking
on internal links within Wikipedia and the rest from external sites or
direct navigation) [3]. For the roughly 50% of pageviews coming from search
engines, this dataset (and the corresponding dashboard) allows you to examine
which search engines were used and how those patterns vary by country,
Wikipedia language, internet browser, and device operating system.
If you have any questions about this dataset or related projects, please
feel free to contact me. More details here:
https://techblog.wikimedia.org/2021/06/07/searching-for-wikipedia/
Best,
Isaac
[1] https://research.wikimedia.org/foundational.html
[2]
https://meta.wikimedia.org/wiki/Research:External_Reuse_of_Wikimedia_Content
[3] https://discovery.wmflabs.org/external/
Launch of The Public Service Media and Public Service Internet Manifesto.
Online event, Thursday 17 June 2021, 16:00 UK time, 17:00 Central
European Time
https://www.eventbrite.co.uk/e/launch-of-the-public-service-media-and-publi…
This event launches “The Public Service Media and Public Service
Internet Manifesto”.
The Internet and the media landscape are broken. The dominant commercial
Internet platforms endanger democracy. The Manifesto stresses the
importance of public service media and the creation of a public service
Internet for the future of society and safeguarding democracy.
In the online event, media experts will talk about why they support and
have signed the Manifesto, which is the outcome of a discussion and
collaboration process organised as part of the AHRC research network
InnoPSM: Innovation in Public Service Media Policies.
With interventions by Alessandro D'Arma, Roy Cobby Avaria, Leonhard
Dobusch, Christian Fuchs, Minna Horowitz, Luciana Musello, Jack L. Qiu,
and Barbara Thomass.
The event takes place on Zoom. After registering on Eventbrite, you will
receive the Zoom access data at the latest one day in advance of the event.
The audience of the event will have the opportunity to be among the
first to read and sign the Public Service Media and Public Service
Internet Manifesto.
Hello everybody,
Within the context of the Knowledge Integrity program
<https://research.wikimedia.org/knowledge-integrity.html>, the Research
Team (and our formal collaborators
<https://www.mediawiki.org/wiki/Wikimedia_Research/Formal_collaborations>)
has been working on releasing relevant datasets in this area.
Recently we have published the following datasets:
- Tracking Knowledge Propagation Across Wikipedia Languages: A dataset of
inter-language knowledge propagation in Wikipedia. Covering all 309
language editions and 33M articles, the dataset aims to track the full
propagation history of Wikipedia concepts and to allow follow-up research on
building predictive models of that propagation. For this purpose, we align
all Wikipedia articles in a language-agnostic manner according to the concept
they cover, their topic, and the timestamp of each article's creation, which
results in 13M propagation instances (paper
<https://arxiv.org/abs/2103.16613>, dataset
<https://zenodo.org/record/4433137>, code
<https://github.com/rodolfovalentim/wikipedia-content-propagation>, meta
<https://meta.wikimedia.org/wiki/Research:Exploration_on_content_propagation…>).
- Wiki-Reliability: A Large Scale Dataset for Content Reliability on
(English) Wikipedia: We select the 10 most popular reliability-related
templates on English Wikipedia and propose an effective method to label
almost 1M samples of Wikipedia article revisions as positive or negative
with respect to each template. Each positive/negative example in the
dataset comes with the full article text and 20 features from the
revision's metadata (paper <https://arxiv.org/abs/2105.04117>, dataset
<https://figshare.com/articles/dataset/Wiki-Reliability_A_Large_Scale_Datase…>,
code <https://github.com/kay-wong/Wiki-Reliability/>, meta
<https://meta.wikimedia.org/wiki/Research:Wiki-Reliability:_A_Large_Scale_Da…>).
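As a rough illustration of the alignment idea described for the propagation
dataset above, the sketch below groups articles by a language-agnostic
concept identifier and orders them by creation time. The record fields
(concept ID, language, creation timestamp) and the sample rows are
hypothetical, and treating each consecutive pair of languages as one
propagation step is an assumption made here for illustration; it is not the
dataset's actual schema or its definition of a propagation instance.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical sample records: (concept_id, language, creation timestamp).
# These are made-up values, not rows from the published dataset.
ARTICLES = [
    ("Q1", "en", "2004-03-01T10:00:00"),
    ("Q1", "de", "2005-07-15T08:30:00"),
    ("Q1", "es", "2004-11-20T12:00:00"),
    ("Q2", "fr", "2010-01-05T09:00:00"),
]


def propagation_orders(articles):
    """Map each concept to its languages, ordered by article creation time."""
    by_concept = defaultdict(list)
    for concept_id, language, created in articles:
        by_concept[concept_id].append((datetime.fromisoformat(created), language))
    return {
        concept_id: [language for _, language in sorted(rows)]
        for concept_id, rows in by_concept.items()
    }


def propagation_steps(orders):
    """List (concept, earlier language, later language) for consecutive creations.

    Assumption: one "step" per consecutive pair in creation order.
    """
    steps = []
    for concept_id, languages in orders.items():
        steps.extend(
            (concept_id, src, dst) for src, dst in zip(languages, languages[1:])
        )
    return steps


orders = propagation_orders(ARTICLES)
steps = propagation_steps(orders)
print(steps)
```

A concept that exists in only one language edition (like the hypothetical
"Q2" here) yields no steps under this simplification.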
We hope that these datasets can be used by the research community to keep
working on understanding and modeling knowledge integrity in Wikipedia.
Currently we are working on expanding both datasets. For knowledge
propagation, we are characterizing the different types of cascades and
generating new prediction models. For the Wiki-Reliability dataset, we are
extending it to more languages.
If you have any questions about these datasets or related projects, please
feel free to contact me.
Best,
--
Diego Sáez Trumper
Senior Research Scientist
Wikimedia Foundation.
Hi all,
Join the Research Team at the Wikimedia Foundation [1] for their monthly
office hours on 2021-06-01 at 16:00-17:00 UTC (9am PT/6pm CEST).
To participate, join the video call via this link [2]. There is no set
agenda. Feel free to add your item to the list of topics in the etherpad
[3] (you can do this after you join the meeting, too); otherwise, you are
also welcome to just hang out. More detailed information (e.g. about how to
attend) can be found here [4].
Through these office hours, we aim to make ourselves more available to
answer some of the research-related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:
- You have a specific research-related question that you suspect you
should be able to answer with the publicly available data, and you don’t
know how to find an answer for it or you just need some more help with it.
For example, how can I compute the ratio of anonymous to registered editors
in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
contributions and you wish to find out if there are ways to use machines to
improve your workflows. These types of questions can sometimes be
harder to answer during an office hour; however, discussing
them can help us understand your challenges better, and we may find ways to
work with each other to support you in addressing them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
does and how we can potentially support you. Specifically for affiliates:
if you are interested in building relationships with the academic
institutions in your country, we would love to talk with you and learn
more. We have a series of programs that aim to expand the network of
Wikimedia researchers globally, and we would love to collaborate more
closely with those of you interested in this space.
- You want to talk with us about one of our existing programs [5].
Hope to see many of you,
Martin on behalf of the WMF Research Team
[1] https://research.wikimedia.org/team.html
[2] https://meet.jit.si/WMF-Research-Office-Hours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[5] https://research.wikimedia.org/projects.html
--
Martin Gerlach
Research Scientist
Wikimedia Foundation