Tomorrow the SRE team will be carrying out an upgrade of the switches in
eqiad row B (https://phabricator.wikimedia.org/T330165) at 14:00 UTC.
The network outage to this row resulting from this work is expected to
be around 30 minutes, all being well.
In support of this work, the Data Engineering team will be putting the HDFS
file system into safe mode at approximately 13:30 UTC tomorrow, which
means that write operations to the cluster will be refused.
Jobs sent to the YARN cluster will also be refused from around the same
time, so please try to plan any work that you may have for the cluster
to avoid this maintenance window.
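If you would like your own scripts to check whether HDFS is in safe mode before they try to write, something along these lines may help. This is only a minimal sketch: it assumes the `hdfs` client is on your PATH and that `hdfs dfsadmin -safemode get` prints lines beginning with "Safe mode is ON" or "Safe mode is OFF". The function names are hypothetical and not part of any existing tooling.

```python
import subprocess


def parse_safemode(output: str) -> bool:
    """Return True if the given `hdfs dfsadmin -safemode get` output reports ON.

    Assumes each status line starts with "Safe mode is ON" or "Safe mode is OFF"
    (possibly followed by " in <namenode>" when there are several NameNodes).
    """
    lines = [ln for ln in output.splitlines() if ln.startswith("Safe mode is")]
    if not lines:
        raise ValueError(f"unexpected safemode output: {output!r}")
    # Treat the cluster as read-only if any NameNode reports ON.
    return any(ln.split()[3] == "ON" for ln in lines)


def hdfs_in_safemode() -> bool:
    """Ask the local hdfs client for the current safe mode status."""
    result = subprocess.run(
        ["hdfs", "dfsadmin", "-safemode", "get"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_safemode(result.stdout)


if __name__ == "__main__":
    if hdfs_in_safemode():
        print("HDFS is in safe mode: writes will be refused, try again later.")
    else:
        print("HDFS is accepting writes.")
```

A job wrapper could call `hdfs_in_safemode()` first and exit early rather than fail part-way through a write.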
Some additional internal-facing services for analytics such as Hive,
Superset, Presto, and the Druid-analytics cluster will also be largely
unavailable for some periods while the switch upgrade takes place.
The public-facing Analytics Query Service (AQS) will continue to
function, albeit with a degraded response to some queries. However,
Wikistats (stats.wikimedia.org) will be unavailable whilst the switch
upgrade is in progress.
Finally, two of the stats servers, stat1007 and stat1009, will be
unavailable, so please save any work that you may have on these servers
before the loss of connectivity.
Please do reach out via any of the normal channels (email:
analytics(a)lists.wikimedia.org, IRC: #wikimedia-analytics, Slack:
#data-engineering) if you have any queries or concerns.
Senior Site Reliability Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>
We encourage each and every one of you to create a program submission. You can submit an interactive workshop or panel, a lecture, a short lightning talk, or a poster for our dedicated poster session. Submissions can be for onsite or online (live or pre-recorded) delivery, or a hybrid combination. We would love to see submissions from all over the world, and this year there is an 'Open Data' track for projects relating to Linked Open Data.
The theme for this year's Wikimania is Diversity, Collaboration, Future. We would particularly like to see topics that strengthen collaboration on Open Data, including Data Analytics.
Session submissions for Wikimania 2023 are open until 28 March.
Visit the following links for further info:
Wiki page: https://wikimania.wikimedia.org/wiki/2023:Program/Submissions
Diff post: https://diff.wikimedia.org/2023/02/28/be-part-of-the-wikimania-2023-program/
Program Submission Form: https://pretalx.com/wm2023/cfp
Chair, Program Subcommittee
Event lead, ESEAP Wikimania 2023 Core Organizing Team
Just a quick note that the Wikigrowth site has been updated to include wiki page creation data from 2021 and 2022.
Sorry for the two year hiatus. Any suggestions to improve the tool and make it more useful (or even merge it with a current site) are always welcome.
The next Research Showcase, focused on Gender and Equity on Wikipedia, will
be live-streamed Wednesday, March 15, at 9:30 AM PDT / 16:30 UTC. Find your
local time here <https://zonestamp.toolforge.org/1678897840>.
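If you prefer to work out the local start time yourself, the number at the end of the zonestamp link is a Unix timestamp. Here is a minimal sketch of the conversion using only the Python standard library. The function name is hypothetical, and the local result depends on your system's time zone settings.

```python
from datetime import datetime, timedelta, timezone

# Unix timestamp taken from the zonestamp link above.
SHOWCASE_TS = 1678897840


def showcase_start(tz=None):
    """Return the showcase start as an aware datetime.

    With tz=None the result is expressed in the system's local time zone.
    """
    start_utc = datetime.fromtimestamp(SHOWCASE_TS, tz=timezone.utc)
    return start_utc.astimezone(tz)


if __name__ == "__main__":
    print("UTC:  ", showcase_start(timezone.utc).isoformat())
    print("Local:", showcase_start().isoformat())
```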
YouTube stream: https://www.youtube.com/watch?v=lw4MzJgDIzo
You can join the conversation on IRC at #wikimedia-research. You can also
watch our past research showcases here:
This month's presentations:
Men Are Elected, Women Are Married: Events Gender Bias on Wikipedia
By *Jiao Sun, University of Southern California*
Human activities can be
seen as sequences of events, which are crucial to understanding societies.
Disproportional event distribution for different demographic groups can
manifest and amplify social stereotypes, and potentially jeopardize the
ability of members in some groups to pursue certain goals. In this paper,
we present the first event-centric study of gender biases in a Wikipedia
corpus. To facilitate the study, we curate a corpus of career and personal
life descriptions with demographic information consisting of 7,854
fragments from 10,412 celebrities. Then we detect events with a
state-of-the-art event detection model, calibrate the results using
strategically generated templates, and extract events that have asymmetric
associations with genders. Our study discovers that the Wikipedia pages
tend to intermingle personal life events with professional events for
females but not for males, which calls for the awareness of the Wikipedia
community to formalize guidelines and train the editors to mind the
implicit biases that contributors carry. Our work also lays the foundation
for future works on quantifying and discovering event biases at the corpus
level.
- Paper: Sun, J. & Peng, N. (2021). Men Are Elected, Women Are Married:
Events Gender Bias on Wikipedia. Proceedings of the 59th Annual Meeting of
the Association for Computational Linguistics and the 11th International
Conference on Natural Language Processing, 350-360.
Twitter reacts to absence of women on Wikipedia: a mixed-methods analysis
of #VisibleWikiWomen campaign
By *Sneh Gupta, Guru Gobind Singh Indraprastha University*
Digital gender divide (DGD) is visible in access, participation,
representation, and biases against women embedded in Wikipedia, the largest
digital reservoir of co-created content. This article examined the content
of #VisibleWikiWomen, a global digital advocacy campaign aimed at
encouraging inclusion of women voices in the global technology conversation
and improving digital sustainability of feminist data on Wikipedia. In a
mixed-methods study, Sentiment Analysis followed by a Feminist Critical
Discourse Analysis of the campaign tweets reveals how digital gender divide
manifested in the public response. An overwhelming majority of tweets
expressed positive sentiment towards the objective of the campaign. An
inductive reading of the coded tweets (n = 1067) generated five themes:
Feminist Activism, Invisibility & Marginalization of Women, Technology for
Women Empowerment, Gendered Knowledge Inequity, and Power Dynamics in the
Digital Sphere. Twitter discourse presented many agitated digital users
calling out the epistemic injustice on Wikipedia that goes beyond the
invisibility of women. Their tweets reveal that they want an equal social
platform inclusive of women of color and varied identities currently absent
in the Wikipedia universe. Extracting ideas, values, and themes from new
media campaigns holds unparalleled potential in the diffusion of
interventions and messages on a larger scale.
- Paper: Gupta, S., & Trehan, K. (2022). Twitter reacts to absence of
women on Wikipedia: a mixed-methods analysis of #VisibleWikiWomen campaign.
Media Asia, 49(2), 130-154.
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Hello and apologies for the short notice.
We are required to put HDFS into safe mode at approximately 13:50
UTC today, which means that the file system will be read-only.
This might be for as little as 30 minutes, but the maintenance window
we're working within is for up to 2 hours, so the actual period of
read-only access will depend on the outcome of the eqiad row A switches
upgrade (https://phabricator.wikimedia.org/T329073) by the
Infrastructure Foundations team.
We will be pausing ingestion to the Data Lake a little ahead of this
time, so there will be a delay in dataset availability on HDFS,
Cassandra, and Druid etc.
Apologies for any inconvenience that this disruption to service will
cause.
Please do let us know by reply to this list or in #wikimedia-analytics
on IRC if you have any queries, or would like to follow along with our
support of the maintenance work.
Senior Site Reliability Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>
Just curious if there is a known cause for the multiple long delays we've
had on the AQS API data being available this week? I know periodic delays
are not uncommon but these seem beyond normal levels.
Please see the call for papers for the 10th edition of Wiki Workshop below.
The call is for extended abstracts (2 pages) of ongoing or completed work.
The deadline is March 23. The submissions are non-archival which means you
can submit work that is already published as well! :)
Submit and join us in conversations about research on the Wikimedia
Head of Research
---------- Forwarded message ---------
From: Martin Gerlach <mgerlach(a)wikimedia.org>
Date: Mon, Feb 20, 2023 at 1:29 AM
Subject: [Wiki-research-l] [events] Wiki Workshop 2023 Call for Papers
The call for papers for the 10th Wiki Workshop in 2023 is out:
https://wikiworkshop.org/2023/#call Submit your 2-page abstracts by March
23 (all submissions are non-archival). The workshop will take place on May
11, 2023. For more information, see the workshop website.
If you have questions about the workshop, please let us know on this list
or at wikiworkshop(a)googlegroups.com.
Looking forward to seeing many of you in this year's edition.
Pablo Aragón, Wikimedia Foundation
Martin Gerlach, Wikimedia Foundation
Evelin Heidel, Wikimedistas de Uruguay
Emily Lescak, Wikimedia Foundation
Francesca Tripodi, University of North Carolina
Bob West, EPFL
Leila Zia, Wikimedia Foundation
We invite contributions to the 10th edition (!) of Wiki Workshop, which
will take place virtually on May 11, 2023 (tentatively 12:00-19:00 UTC).
Wiki Workshop is the largest Wikimedia research event of the year, aimed at
bringing together researchers who study all aspects of Wikimedia projects
(including, but not limited to, Wikipedia, Wikidata, Wikimedia Commons,
Wikisource, and Wiktionary) as well as Wikimedia developers, affiliate
organizations, and volunteer editors. Co-organized by the Wikimedia
Foundation’s Research team and members of the Wikimedia research community,
the workshop facilitates a direct pathway for exchanging ideas between the
organizations that serve Wikimedia projects and the researchers actively
studying them. New this year: Building on the successful experiences of
organizing Wiki Workshop in 2015 <https://wikiworkshop.org/2015/>, 2016
<https://wikiworkshop.org/2016/>, 2017 <https://wikiworkshop.org/2017/>,
2018 <https://wikiworkshop.org/2018/>, 2019 <https://wikiworkshop.org/2019/>
, 2020 <https://wikiworkshop.org/2020/>, 2021
<https://wikiworkshop.org/2021/>, and 2022 <https://wikiworkshop.org/2022/>
and based on feedback from authors and participants over the years, we are
introducing a few updates to the research track of the workshop for 2023:
This 10th edition will take place as a standalone event (rather than in
co-location with a conference, as in previous years).
We have changed the format of submissions and will only accept 2-page
extended abstracts (following the successful IC2S2 model).
Submissions are non-archival, so we welcome ongoing, completed, and
already published work.
We are excited to share that the authors of Wiki Workshop 2023 will have
the opportunity to receive feedback, improve their work, and submit the
extended version of their research paper to a special issue of the ACM
Transactions on the Web, which will have a dedicated open call for papers
later in 2023.
Topics include, but are not limited to:
new technologies and initiatives to grow content, quality, equity,
diversity, and participation across Wikimedia projects
use of bots, algorithms, and crowdsourcing strategies to curate, source,
or verify content and structured data
bias in content and gaps of knowledge on Wikimedia projects
relation between Wikimedia projects and the broader (open) knowledge
exploration of what constitutes a source and how/if the incorporation of
other kinds of sources are possible (e.g., oral histories, video)
detection of low-quality, promotional, or fake content (misinformation
or disinformation), as well as fake accounts (e.g., sock puppets)
questions related to community health (e.g., sentiment analysis,
harassment detection, tools that could increase harmony)
motivations, engagement models, incentives, and needs of editors,
readers, and/or developers of Wikimedia projects
innovative uses of Wikipedia and other Wikimedia projects for AI and NLP
applications and vice versa
consensus-finding and conflict resolution on editorial issues
dynamics of content reuse across projects and the impact of policies and
community norms on reuse
privacy, security, and trust
collaborative content creation
innovative uses of Wikimedia projects' content and consumption patterns
as sensors for real-world events, culture, etc.
open-source research code, datasets, and tools to support research on
Wikimedia contents and communities
connections between Wikimedia projects and the Semantic Web
strategies for how to incorporate Wikimedia projects into media literacy
This year’s Wiki Workshop solicits extended abstracts (PDF format, maximum
2 pages, including references). Submissions that exceed the 2-page limit
will be automatically rejected. Authors may include 1 additional page with
figures and/or tables (including captions) only. Initial submissions
require names and affiliations of authors, 5 keywords, a title, abstract,
and a main text outlining the contribution, methods, findings, and impact
of the work, whichever is relevant. Submissions will be non-archival and, as
a result, may cover work that is already published, under review, or
ongoing. All submissions will be reviewed by multiple members of the Wiki
Workshop Program Committee. The names of the authors will be revealed to
the reviewers, whereas reviewers will remain anonymous to authors. Authors
of accepted abstracts will be invited to present their research in a
pre-recorded oral presentation with dedicated time for live Q&A on May 11,
2023. Accepted abstracts may be shared on the website prior to the event.
The template for formatting the submission as well as the submission link
to EasyChair will be made available by February 23.
Martin Gerlach (he/him) | Senior Research Scientist | Wikimedia Foundation