Hi all,
The next Research Showcase, with the theme of Images on Wikipedia, will be
live-streamed Wednesday, April 19, at 16:30 UTC. Find your local time here
<https://zonestamp.toolforge.org/1681921857>.
YouTube stream: https://www.youtube.com/watch?v=vW0waU-QArU
You can join the conversation on IRC at #wikimedia-research or on the
YouTube chat.
This month's presentations:
A large scale study of reader interactions with images on Wikipedia
By *Daniele Rama, University of Turin*
Wikipedia is the largest source of free
encyclopedic knowledge and one of the most visited sites on the Web. To
increase reader understanding of the article, Wikipedia editors add images
within the text of the article’s body. However, despite their widespread
usage on web platforms and the huge volume of visual content on Wikipedia,
little is known about the importance of images in the context of free
knowledge environments. To bridge this gap, we collect data about English
Wikipedia reader interactions with images during one month and perform the
first large-scale analysis of how interactions with images happen on
Wikipedia. First, we quantify the overall engagement with images, finding
that one in 29 pageviews results in a click on at least one image, one
order of magnitude higher than interactions with other types of article
content. Second, we study what factors associate with image engagement and
observe that clicks on images occur more often in shorter articles and
articles about visual arts or transports and biographies of less well-known
people. Third, we look at interactions with Wikipedia article previews and
find that images help support reader information need when navigating
through the site, especially for more popular pages. The findings in this
study deepen our understanding of the role of images for free knowledge and
provide a guide for Wikipedia editors and web user communities to enrich
the world’s largest source of encyclopedic knowledge.
- Paper:
https://epjdatascience.springeropen.com/articles/10.1140/epjds/s13688-021-0…
Visual gender biases in Wikipedia: A systematic evaluation across the ten
most spoken languages
By *Pablo Beytia, Catholic University of Chile*
The existing research suggests a significant gender gap in Wikipedia
biographical articles, with a minimal representation of women and gender
asymmetries in the textual content. However, the visual aspects of this gap
(e.g., image volume and quality) have received little attention. This study
examined asymmetries between women's and men's biographies, exploring
written and visual content across the ten most widely spoken languages. The
cross-lingual analysis reveals that (1) the most salient male biases appear
when editors select which personalities should have a Wikipedia page, (2)
the trends in written and visual content are dissimilar, (3) male
biographies tend to have more images across languages, and (4) female
biographies have better visual quality on average. The open database of
this study provides eight indicators of gender asymmetries in ten
occupational domains and ten languages. That information allows for a
granular view of gender biases, as well as exploring more macroscopic
phenomena, such as the similarity between Wikipedia versions according to
their gender bias structures.
- Papers:
Beytía, P., Agarwal, P., Redi, M., & Singh, V. K. (2022). Visual Gender
Biases in Wikipedia: A Systematic Evaluation across the Ten Most Spoken
Languages. Proceedings of the International AAAI Conference on Web and
Social Media, 16(1), 43-54. https://doi.org/10.1609/icwsm.v16i1.19271
https://ojs.aaai.org/index.php/ICWSM/article/view/19271
Beytía, P. & Wagner, C. (2022). Visibility layers: a framework for
systematizing the gender gap in Wikipedia content. Internet Policy Review,
11(1). https://doi.org/10.14763/2022.1.1621
https://policyreview.info/articles/analysis/visibility-layers-framework-sys…
You can watch our past Research Showcases here:
https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase
Hope you can join us!
Warm regards,
Emily
--
Emily Lescak (she / her)
Senior Research Community Officer
The Wikimedia Foundation
Hello,
Apologies for the short notice. The SRE team will be carrying out an
upgrade of the switches in eqiad row D later today
(https://phabricator.wikimedia.org/T333377) at approximately 14:00 UTC.
The network outage to this row resulting from this work is expected to
be around 30 minutes, all being well.
In support of this work, the Data Engineering team will be putting the HDFS
file system into safe mode at approximately 13:30 today, which means
that write operations to the cluster will be refused.
Jobs sent to the YARN cluster will also be refused from around the same
time, so please try to plan any work that you may have for the cluster
to avoid this maintenance window.
Read-only access to Hive, Presto, Superset, and Turnilo should continue to
function normally throughout the maintenance window.
Finally, two of the stats servers (stat1005 and stat1006) will be
unavailable, so please save any work that you may have on these servers
before the loss of connectivity.
Please do reach out via any of the normal channels (email:
analytics@lists.wikimedia.org, IRC: #wikimedia-analytics, Slack:
#data-engineering) if you have any queries or concerns.
Kind regards,
Ben
--
*Ben Tullis* (he/him)
Senior Site Reliability Engineer
Wikimedia Foundation <https://wikimediafoundation.org/>