The WMF Research team has published a new pageview report of inbound
traffic coming from Facebook, Twitter, YouTube, and Reddit.
The report contains a list of all articles that received at least 500 views
from one or more of these platforms (i.e. someone clicked a link on Twitter
that sent them directly to a Wikipedia article). The report is available
on-wiki and will be updated daily at around 14:00 UTC with traffic counts
from the previous calendar day.
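As a rough illustration of the inclusion rule described above, here is a minimal Python sketch. It is not the Research team's actual pipeline: the row layout (article, platform, views), the function name, and the sample data are all hypothetical, and it assumes the 500-view threshold applies to each platform separately rather than to the combined total.

```python
from collections import defaultdict

MIN_VIEWS = 500  # threshold stated in the announcement


def articles_over_threshold(rows, min_views=MIN_VIEWS):
    """Return {article: {platform: views}} for articles where at least
    one platform sent min_views or more views.

    rows is an iterable of (article, platform, views) tuples; this
    layout is an assumption made for the example only.
    """
    by_article = defaultdict(dict)
    for article, platform, views in rows:
        # Sum views in case the same article/platform pair appears twice.
        by_article[article][platform] = by_article[article].get(platform, 0) + views
    return {
        article: platforms
        for article, platforms in by_article.items()
        if any(v >= min_views for v in platforms.values())
    }


# Made-up sample data, not real traffic counts.
sample_rows = [
    ("Example_A", "Facebook", 620),
    ("Example_A", "Twitter", 40),
    ("Example_B", "Reddit", 499),
    ("Example_C", "YouTube", 500),
]
report = articles_over_threshold(sample_rows)
```

With this sample, Example_B is left out because no single platform reached 500 views, while Example_A keeps its smaller Twitter count alongside the qualifying Facebook count.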
We believe this report provides editors with a valuable new information
source. Daily inbound social media traffic stats can help editors monitor
edits to articles that are going viral on social media sites and/or are
being linked to by the social media platform itself in order to fact-check
disinformation and other controversial content.
The social media traffic report also contains additional public article
metadata that may be useful in the context of monitoring articles that are
receiving unexpected attention from social media sites, such as:
- the total number of pageviews (from all sources) that article received
in the same period of time
- the number of pageviews the article received from the same platform
(e.g. Facebook) the previous day (two days ago)
- the number of editors who have the page on their watchlist
- the number of editors who have watchlisted the page AND recently
We want your feedback! We have some ideas of our own for how to improve the
report, but we want to hear yours! If you have feature suggestions, please
add them here. We intend to maintain this daily report for at least the
next two months. If we receive feedback that the report is useful, we will
consider making it available indefinitely.
If you have other questions about the report, please first check out our
(still growing) FAQ. All questions, comments, concerns, ideas, etc. are
welcome on the project talkpage on Meta.
Jonathan T. Morgan
Senior Design Researcher
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
*Please note that I do not expect a response from you on evenings or
We are delighted to announce that Wiki Workshop 2020 will be held in
Taipei on April 20 or 21, 2020 (the date will be finalized soon) as
part of the Web Conference 2020. In past years, Wiki Workshop
has traveled to Oxford, Montreal, Cologne, Perth, Lyon, and San
You can read more about the call for papers and the workshops at
http://wikiworkshop.org/2020/#call. Please note that the deadline for
submissions to be considered for the proceedings is January 17. All
other submissions should be received by February 21.
If you have questions about the workshop, please let us know on this
list or at wikiworkshop(a)googlegroups.com.
Looking forward to seeing you in Taipei.
Miriam Redi, Wikimedia Foundation
Bob West, EPFL
Leila Zia, Wikimedia Foundation
Happy to jump in as a 2nd mentor.
I replied to the ticket on Phabricator; we can discuss in more detail
there or on IRC.
We just launched [[WM:Techblog]] <https://techblog.wikimedia.org/>, a new
place to share stories from the Wikimedia technical community with the
We created this venue to provide a central place for people to share
stories about the technical work that they do, such as:
How to run a top ten website that is all Open Source?
How do we make our codebase more modular and future-proof?
How do we come together to work on technical projects?
How to develop a product for many different languages?
How do new tools and bots help run Wikimedia projects?
What can we learn from data science and research?
And much more…
Check out these posts for examples:
Computational knowledge: Wikidata, Wikidata query Service, and women who
By Trey Jones
Parsoid in PHP, or There and Back Again
By S.Subramanya Sastry and C.Scott Ananian
Wikimedia projects have many intersections with the larger Open Source
community. It’s our hope that the technical blog will create more
visibility and conversations and provide information to a wider audience of
people who are interested in our work.
If you are interested in writing a blog post, read the editorial guidelines
to learn about what would make a good story and how to get published! We
manage the process of reviewing and publishing posts here:
Thanks to everyone who contributed to making the blog possible. There were
For the WM Techblog team,
Sarah R. Rodlund
Technical Writer, Developer Advocacy
Hello analytics folks!
My name is Abel Serrano Juste (alias Quasipodo in Wikimedia, akronix in
I'm considering doing GSoC with WMF this summer, and I'd like to do it
with the analytics team, as my recent experience is mainly in this area.
For instance, I propose to implement this one:
https://phabricator.wikimedia.org/T195033 as I'm already very familiar
with the topic (I'm the presenter of the talk which eventually led to
that ticket being opened). However, I am open to hearing any other ideas or suggestions.
About me: I'm currently finishing my master's in Data Science and I should be
done by the middle of June. I have previously attended a couple of
Wikimedia hackathons and I'm familiar with MediaWiki analysis and data
visualization (I'm the developer of WikiChron: http://wikichron.science/
). I'm quite fluent in Python. You can also have a look at my GitHub
profile: https://github.com/Akronix/
Furthermore, I have been working for two years in academia doing
research on wikis (smaller ones than Wikipedia) and have published some
articles on the matter: https://www.akronix.es/publications.html
Here is some information that GSoC mentors need to know, in case you
are not familiar with it:
* GSoC mentors are typically required to commit 4-5 hours per week,
* GSoC mentor responsibilities
includes 'Selection process tips' which might help answer your
previous question about likelihood of selection)
Best wishes and take care,
Abel Serrano Juste.
Join us for our monthly Analytics/Research Office hours on 2020-03-25 at
17:00-18:00 (UTC). Bring all your research questions and ideas to discuss
projects, data, analysis, etc…
To participate, please join the IRC channel: #wikimedia-research.
More detailed information can be found here or on the etherpad if
you would like to add items to the agenda or check notes from previous meetings.
The next Research Showcase will be live-streamed on Wednesday, March 18, at
9:30 AM PDT/16:30 UTC. We’ll have a presentation on topic modeling by
Jordan Boyd-Graber. A question-and-answer session will follow.
YouTube stream: https://www.youtube.com/watch?v=fiD9QTHNVVM
As usual, you can join the conversation on IRC at #wikimedia-research. You
can also watch our past research showcases here:
This month's presentation:
Big Data Analysis with Topic Models: Evaluation, Interaction, and
By: Jordan Boyd-Graber, University of Maryland
A common information need is to understand large, unstructured datasets:
millions of e-mails during e-discovery, a decade's worth of science
correspondence, or a day's tweets. In the last decade, topic models have
become a common tool for navigating such datasets even across languages.
This talk investigates the foundational research that allows successful
tools for these data exploration tasks: how to know when you have an
effective model of the dataset; how to correct bad models; how to measure
topic model effectiveness; and how to detect framing and spin using these
techniques. After introducing topic models, I argue why traditional
measures of topic model quality---borrowed from machine learning---are
inconsistent with how topic models are actually used. In response, I
describe interactive topic modeling, a technique that enables users to
impart their insights and preferences to models in a principled,
interactive way. I will then address measuring topic model effectiveness in
Overview of topic models:
Topic model evaluation: http://umiacs.umd.edu/~jbg//docs/nips2009-rtl.pdf
Interactive topic modeling:
Topic Models for Categorization:
Janna Layton (she, her)
Administrative Assistant - Product & Technology
Wikimedia Foundation <https://wikimediafoundation.org/>