We’re preparing for the upcoming issue of the research newsletter (
https://meta.wikimedia.org/wiki/Research:Newsletter ) and looking for
contributors. If you are interested in reviewing or summarizing recently
published research for our audience of Wikimedians and academic
researchers, please take a look at
https://etherpad.wikimedia.org/p/WRN202310 and add your name next to any
paper you are interested in covering. This issue (for October 2023) is
scheduled for publication on November 5, 2023, 20:00 UTC; texts should be in
a day before that. If you can't make this deadline but would like to cover
a particular paper in the subsequent issue, leave a note next to the
paper's entry. As usual, short notes and one-paragraph reviews are most
Alhaji Darajaati on behalf of the Newsletter team
[Moving research-wmf to Bcc.]
Dear Hanxuan Sun,
Thank you for reaching out.
*Some tips for increasing the chances of success for your project*
- *Reduce the chance of surprising existing Wikipedia volunteers. *For
- If your project involves recruiting existing Wikipedia editors or
changing content in a Wikipedia language, please make sure you communicate
that to the relevant Wikipedia language community and engage in follow-up
conversations they may want to have with you. On English Wikipedia, you can
do it at
. (The village pump is a page that many Wikipedia language
communities maintain for this type of conversation. You can find other
languages' village pump pages by clicking on the languages menu on the
top-right side of https://en.wikipedia.org/wiki/Wikipedia:Village_pump .)
- Create a research page for your project on MetaWiki at
https://meta.wikimedia.org/wiki/Research:New_project and link it in
your communications. This is the place where others can learn more about
your research. Sample projects are listed at
https://meta.wikimedia.org/wiki/Research:Index . (Reference your IRB
from this research page if you can.)
- *Understand the context. *We can give you tips to improve your work;
however, the relevant Wikimedia project (Wikipedia language community in
your case) is the community you'll need to primarily work with.
- *Survey privacy and data retention. *With Wikimedians, sometimes less
is more. :) I highly recommend you think hard about what data you actually
need and how long you will keep it for what reason. Keeping sensitive data
in perpetuity can raise alarms b/c privacy is something many Wikipedians
- *Survey questions. *There are at least a few folks on this list that
have expertise on this front. They may choose to leave feedback for you.
Thanks for being proactive and asking for feedback. :) I had a quick look
at the first few pages. One question that immediately caught my attention
was the question about gender where you ask about the gender and you offer
options about sex. That question needs a fix, please. You can find a sample
of survey questions (including gender-related ones at
). Having seen this one example you can improve, I highly recommend that
you seek specific input into your survey questions from a survey specialist
before running the survey to make sure the survey questions can help you
with the questions you want to answer as part of the research and that they
are as close as possible to the latest best practices in survey design.
Head of Research
On Wed, Oct 18, 2023 at 11:51 AM Hanxuan Sun <hanxuan.sun(a)unsw.edu.au>
> Dear Wikipedia organization members,
> Researchers at UNSW are conducting a project about exploring the reasons
> behind translators making changes to their translations on Wikipedia, a
> popular online encyclopedia which uses a ‘crowdsourcing’ approach,
> attracting volunteer translators to translate its content. The research
> will investigate the factors that influence translators’ decisions to
> revise existing translations, such as the quality of translation (e.g.,
> from machine translation), personal beliefs, and discussions with peers in
> online communities, etc.
> The research study is looking to recruit people who meet the following
> 1. 18 years of age or older;
> 2. Live in Australia;
> 3. Proficient Chinese and English bilinguals;
> 4. Active Wikipedia online volunteers engaged in revision.
> Participants will be asked to complete the following research activities
> if they agree to participate:
> - Online surveys with 34 questions that will take approximately 20 to
> 25 minutes to complete; and/or
> - Follow-up one-on-one interviews via Zoom, which will need around
> 30 minutes; and/or
> - Observational study for an active group; and/or
> - Focus group discussion via Zoom, which will take around 2 hours.
> - A full description of all research activities, including any risks,
> harms or discomforts that you may experience while participating in this
> research is included in the attached Participant Information Statement and
> Consent Form.
> Please contact the following person via email or phone to register your
> interest in taking part in the research:
> Hanxuan Sun
> Student Investigator
> If you have questions about the research and would like to contact the
> Chief Investigator, please contact the following person:
> *Chief Investigator *
> Stephen Doherty
> Chief Investigator
> (02)9385 1681
> This project has been approved by the ethics committee at UNSW; the approval
> is attached. The participant consent form is included on the cover page of
> the surveys. The links to the surveys are: English version:
> https://unsw.au1.qualtrics.com/jfe/form/SV_3fNVlLeMgM3BNzg ; Chinese
> version: https://unsw.au1.qualtrics.com/jfe/form/SV_0AjTai9lMOf48FE .
> Would you please check the content of the surveys at your earliest convenience?
> Thank you so much! If you have any questions, feel free to contact me.
> Hanxuan Sun.
> Research-wmf mailing list -- research-wmf(a)lists.wikimedia.org
> To unsubscribe send an email to research-wmf-leave(a)lists.wikimedia.org
The next Research Showcase, focused on *Data Privacy*, will be
live-streamed on Wednesday, October 18, at 9:30 AM PDT / 16:30 UTC. Find
your local time here <https://zonestamp.toolforge.org/1697646641>.
YouTube stream: https://www.youtube.com/watch?v=ntgRsMaDlsw. As usual, you
can join the conversation in the YouTube chat as soon as the showcase goes
This month's presentations:
Wikipedia Reader Navigation: When Synthetic Data Is Enough
By *Akhil Arora, EPFL*
Every day millions of people read Wikipedia. When navigating the vast
space of available topics using hyperlinks, readers describe trajectories
on the article network. Understanding these navigation patterns is crucial
to better serve readers’ needs and address structural biases and knowledge
gaps. However, systematic studies of navigation on Wikipedia are hindered
by a lack of publicly available data due to the commitment to protect
readers' privacy by not storing or sharing potentially sensitive data. In
this paper, we ask: How well can Wikipedia readers' navigation be
approximated by using publicly available resources, most notably the
Wikipedia clickstream data <https://wikinav.toolforge.org/>? We
systematically quantify the differences between real navigation sequences
and synthetic sequences generated from the clickstream data, in 6 analyses
across 8 Wikipedia language versions. Overall, we find that the differences
between real and synthetic sequences are statistically significant, but
with small effect sizes, often well below 10%. This constitutes
quantitative evidence for the utility of the Wikipedia clickstream data as
a public resource: clickstream data can closely capture reader navigation
on Wikipedia and provides a sufficient approximation for most practical
downstream applications relying on reader data. More broadly, this study
provides an example for how clickstream-like data can generally enable
research on user navigation on online platforms while protecting users’
How to tell the world about data you cannot show them: Differential privacy
at the Wikimedia Foundation
By *Hal Triedman, Wikimedia Foundation*
The Wikimedia Foundation (WMF), by virtue of its centrality on the internet,
collects lots of data about platform activities. Some of that data is made
public (e.g. global daily pageviews); other data types are not shared (or
are pseudonymized prior to sharing), largely due to privacy concerns.
Differential privacy is a statistical definition of privacy that has gained
prominence in academia, but is still an emerging technology in industry. In
this talk, I share the story of how we put differential privacy into
production at the WMF, through looking at the case study of geolocated
daily pageview counts.
You can also watch our past research showcases here:
Lead Research Community Officer
Wikimedia Foundation <https://wikimediafoundation.org/>
Hello (Semantic) MediaWiki maintainers, software developers, consultants, researchers!
SMWCon 2023 (https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2023) will be held in person in Paderborn, Germany, and online. Over three days there will be talks, tutorials and hackathons.
Registration is open on Eventbrite <https://www.eventbrite.com/e/smwcon-fall-2023-tickets-719554987337>.
Go there and take advantage of the early bird rates!
Call for Contributions
This conference addresses everybody interested in wikis and open knowledge, especially in Semantic MediaWiki, e.g. users, developers, consultants, business or government representatives, and researchers.
This conference aims to:
* inspire/onboard new users,
* inform on where and how MediaWiki is used,
* convey and consolidate best practices,
* initiate/foster/integrate application and development and
* strengthen the community of stakeholders and its service portfolio.
Learn how to "do" MediaWiki in order to assume your responsibilities regarding your organization's knowledge management.
Your experience is valuable for all of us! So please share and propose a talk, tutorial or other contribution.
Go to the Conference Page (https://www.semantic-mediawiki.org/wiki/SMWCon_Fall_2023)
and hit the 'Propose a talk here' button.
Please propose a contribution if you plan to have one, even if you don't have the details yet. For us it is important to know what we can expect.
We look forward to your contribution!
Bernhard and Tobias
on behalf of https://mwstake.org/ - the MediaWiki Stakeholders' Group