Hi all,
Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours on 2021-06-01 at 16:00-17:00 UTC (9am PT/6pm CEST).
To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3] (you can do this after you join the meeting, too); otherwise, you are
welcome to just hang out. More detailed information (e.g. about how to
attend) can be found here [4].
Through these office hours, we aim to make ourselves more available to
answer some of the research-related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:
- You have a specific research-related question that you suspect you
  should be able to answer with the publicly available data, but you don’t
  know how to find an answer for it, or you just need some more help with it.
  For example: how can I compute the ratio of anonymous to registered editors
  in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
  contributions and you wish to find out if there are ways to use machines to
  improve your workflows. These types of questions can sometimes be
  harder to answer during an office hour; however, discussing
  them can help us understand your challenges better, and we may find ways to
  work with each other to support you in addressing them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
  does and how we can potentially support you. Specifically for affiliates:
  if you are interested in building relationships with the academic
  institutions in your country, we would love to talk with you and learn
  more. We have a series of programs that aim to expand the network of
  Wikimedia researchers globally, and we would love to collaborate more
  closely with those of you interested in this space.
- You want to talk with us about one of our existing programs [5].
Hope to see many of you,
Martin on behalf of the WMF Research Team
[1] https://research.wikimedia.org/team.html
[2] https://meet.jit.si/WMF-Research-Office-Hours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[5] https://research.wikimedia.org/projects.html
--
Martin Gerlach
Research Scientist
Wikimedia Foundation
Hello all,
In this month’s showcase, we have two presentations on the importance and
value of Wikipedia outside of Wikipedia. In the first talk, Nick Vincent
will present results on the importance of Wikipedia for search engines in
which they looked at how often and where Wikipedia links appeared in search
results of Google, Bing, DuckDuckGo, etc. In the second talk, Tiziano
Piccardi will present a recent study on the value of Wikipedia’s links to
external websites by quantifying the amount of traffic generated by those
links and analyzing for which types of articles Wikipedia acts as a
stepping stone to the intended destination.
Time/date: May 19, 16:30 UTC (9:30am PT / 12:30pm ET / 18:30 CEST)
YouTube link: https://www.youtube.com/watch?v=VoX5rFNzkXs
Talk 1
Speaker: Nick Vincent (Northwestern University, USA)
Title: The Importance of Wikipedia to Search Engines and Other Systems
Abstract: A growing body of work has highlighted the important role that
Wikipedia’s volunteer-created content plays in helping search engines
achieve their core goal of addressing the information needs of hundreds of
millions of people. In this talk, I will discuss a recent study looking at
how often, and where, Wikipedia links appear in search engine results. In
this study, we found that Wikipedia links appeared prominently and
frequently in Google, Bing, and DuckDuckGo results, though less often for
searches from a mobile device. I will connect this study to past work
looking at the value of Wikipedia links to other online platforms, and to
ongoing discussions around Wikipedia's value as a training source for
modern AI.
Related paper:
- A Deeper Investigation of the Importance of Wikipedia Links to Search
  Engine Results. To appear in CSCW 2021.
  https://nickmvincent.com/static/wikiserp_cscw.pdf (pdf)
Talk 2
Speaker: Tiziano Piccardi (EPFL, Switzerland)
Title: On the Value of Wikipedia as a Gateway to the Web
Abstract: By linking to external websites, Wikipedia can act as a gateway
to the Web. However, little is known about the amount of traffic generated
by Wikipedia's external links. We fill this gap in a detailed analysis of
usage logs gathered from Wikipedia users' client devices. We discovered
that in one month, English Wikipedia generated 43M clicks to external
websites, with the highest click-through rate on the official links listed
in the infoboxes. Our analysis highlights that the articles about
businesses, educational institutions, and websites show the highest
engagement, and for some content, Wikipedia acts as a stepping stone to the
intended destination. We conclude our analysis by quantifying the
hypothetical economic value of the clicks received by external websites. We
estimate that the respective website owners would need to pay a total of
$7-13 million per month to obtain the same volume of traffic via sponsored
search. These findings shed light on Wikipedia's role not only as an
important source of information but also as a high-traffic gateway to the
broader Web ecosystem.
Related paper:
- On the Value of Wikipedia as a Gateway to the Web. WWW 2021.
  https://arxiv.org/pdf/2102.07385 (pdf)
--
Janna Layton (she/her)
Administrative Associate - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>
<tl;dr>: Your feedback as a technically-minded Wikimedian is welcome on
https://www.mediawiki.org/wiki/Developer_Advocacy/Developer_Portal/Content_…
Hi everyone,
last year[1] the Developer Advocacy team started to work on a single,
central entry point for developers and tech-minded people for
Wikimedia's technical documentation[2].
A central entry point ("developer portal") would cover common technical
use cases to allow existing and future technical contributors and
developers to find the information they need.
Each use case links to its most relevant documentation (i.e. to pages
on wikitech or mediawiki.org).
This is part of a larger initiative to implement an organization
strategy for key technical documents: Understand challenges about
finding and maintaining docs, identify key docs, and investigate ways
to improve our workflows around documentation.
So far we:
* Researched and reviewed existing documentation venues and pages
* Reviewed developer/documentation portals in the broader industry
* Interviewed several engineering teams around technical documentation
workflows, audiences, and key technical docs (the key themes from
these conversations are available, see the link above)
* Created an initial draft for the structure and content of the single
entry point
Now we would like to improve this initial content draft with your help.
1) Please take a look at the initial draft at
https://www.mediawiki.org/wiki/Developer_Advocacy/Developer_Portal/Content_…
2) Then, please help improve it by sharing your thoughts and feedback
until *May 25th* at
https://www.mediawiki.org/wiki/Developer_Advocacy/Developer_Portal/Content_…
Note that this is only a draft of how to structure content.
It is not a design or layout proposal and it is not an implementation.
For additional future work, see also the Phabricator workboard[3].
Next steps include:
* A session at the remote Hackathon (May 22-23; [4])
* Incorporate content improvements, based on your feedback
* Check the documents linked from the single entry point proposal for
accuracy
* Investigate requirements for the technical implementation
* Investigate improvements of processes around technical documentation
(structure, locations, navigation, stewardship, etc).
If you want to learn more about the project, please see
https://www.mediawiki.org/wiki/Developer_Advocacy/Developer_Portal
Thanks to everybody who has provided their valuable input to get to
this stage, and thanks in advance to everyone who will!
Cheers,
andre
[1] https://lists.wikimedia.org/pipermail/wikitech-l/2020-August/093773.html
[2] https://www.mediawiki.org/wiki/Developer_Advocacy/Developer_Portal
[3] https://phabricator.wikimedia.org/tag/wikimedia-developer-portal/
[4] https://www.mediawiki.org/wiki/Wikimedia_Hackathon_2021
--
Andre Klapper (he/him) | Bugwrangler / Developer Advocate
https://blogs.gnome.org/aklapper/
Hi all,
tl;dr: we'd like to remove the rev_is_revert field from the
mediawiki.revision-create stream to solve a missing event problem.
For years now, we've known that the mediawiki.revision-create stream
<https://stream.wikimedia.org/?doc#/streams/get_v2_stream_mediawiki_revision…>
has been missing many real revision create events
<https://phabricator.wikimedia.org/T215001> when compared with
MediaWiki's MySQL databases. This makes the stream almost useless for
those who want to use it as a notification mechanism about all MediaWiki
page changes.
The reason for the large number of missing events is that the code that
emits the event subscribes to the wrong MediaWiki hook. This patch
<https://gerrit.wikimedia.org/r/c/mediawiki/extensions/EventBus/+/679353/> will
fix this; however, the correct hook does not give us the information we need
to set the rev_is_revert and rev_revert_details fields. These fields are
relatively new (only added in August 2020
<https://github.com/wikimedia/schemas-event-primary/commit/53b6480cb1045316c…>).
We think that including the missing revisions is more important than
capturing the revert information, which really only captures whether or not
a user used the MediaWiki UI to issue a revert.
We plan on moving forward with this, but would like feedback before we do.
If you have objections, or other ideas on how we can provide this data
(like maybe including it in mediawiki/revision-tags-change
<https://schema.wikimedia.org/repositories//primary/jsonschema/mediawiki/rev…>
and making that public?), let us know by replying to this email or in this
ticket: https://phabricator.wikimedia.org/T215001
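For consumers that currently read these fields, a minimal sketch of handling events with or without them might look like the following. This is only an illustration under stated assumptions: the helper name revert_status and the sample payloads are hypothetical, and only the rev_is_revert and rev_revert_details field names come from this message.

```python
import json
from typing import Any, Dict, Optional


def revert_status(event: Dict[str, Any]) -> Optional[bool]:
    """Return True/False if the event carries rev_is_revert, else None.

    Hypothetical helper: it treats the field as optional so that a
    consumer keeps working whether or not the field is present.
    """
    value = event.get("rev_is_revert")
    return value if isinstance(value, bool) else None


# Hypothetical sample payloads; real events contain many more fields.
old_style = json.loads(
    '{"rev_id": 1, "rev_is_revert": true, "rev_revert_details": {}}'
)
new_style = json.loads('{"rev_id": 2}')

for ev in (old_style, new_style):
    status = revert_status(ev)
    if status is None:
        label = "revert status unknown"
    else:
        label = "revert" if status else "not a revert"
    print(ev["rev_id"], label)
```

Returning None rather than False when the field is absent keeps "not a revert" distinct from "no revert information available".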
Thanks!
-Andrew Otto
SRE, Data Engineering, WMF
Hi all,
Join the Research Team at the Wikimedia Foundation [1] for their monthly
Office hours on 2021-05-04 at 16:00-17:00 UTC (9am PT/6pm CEST).
To participate, join the video-call via this link [2]. There is no set
agenda - feel free to add your item to the list of topics in the etherpad
[3] (you can do this after you join the meeting, too); otherwise, you are
welcome to just hang out. More detailed information (e.g. about how to
attend) can be found here [4].
Through these office hours, we aim to make ourselves more available to
answer some of the research-related questions that you as Wikimedia
volunteer editors, organizers, affiliates, staff, and researchers face in
your projects and initiatives. Some example cases we hope to be able to
support you in:
- You have a specific research-related question that you suspect you
  should be able to answer with the publicly available data, but you don’t
  know how to find an answer for it, or you just need some more help with it.
  For example: how can I compute the ratio of anonymous to registered editors
  in my wiki?
- You run into repetitive or very manual work as part of your Wikimedia
  contributions and you wish to find out if there are ways to use machines to
  improve your workflows. These types of questions can sometimes be
  harder to answer during an office hour; however, discussing
  them can help us understand your challenges better, and we may find ways to
  work with each other to support you in addressing them in the future.
- You want to learn what the Research team at the Wikimedia Foundation
  does and how we can potentially support you. Specifically for affiliates:
  if you are interested in building relationships with the academic
  institutions in your country, we would love to talk with you and learn
  more. We have a series of programs that aim to expand the network of
  Wikimedia researchers globally, and we would love to collaborate more
  closely with those of you interested in this space.
- You want to talk with us about one of our existing programs [5].
Hope to see many of you,
Martin on behalf of the WMF Research Team
[1] https://research.wikimedia.org/team.html
[2] https://meet.jit.si/WMF-Research-Office-Hours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
[4] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[5] https://research.wikimedia.org/projects.html
--
Martin Gerlach
Research Scientist
Wikimedia Foundation