The WMF Research team has published a new pageview report of inbound
traffic coming from Facebook, Twitter, YouTube, and Reddit.[1]
The report contains a list of all articles that received at least 500 views
from one or more of these platforms (e.g. someone clicked a link on Twitter
that sent them directly to a Wikipedia article). The report is available
on-wiki and will be updated daily at around 14:00 UTC with traffic counts
from the previous calendar day.
We believe this report provides editors with a valuable new source of
information. Daily inbound social media traffic stats can help editors
monitor edits to articles that are going viral on social media, as well as
articles that a platform itself links to in order to fact-check
disinformation and other controversial content[2][3].
The social media traffic report also contains additional public article
metadata that may be useful when monitoring articles that are receiving
unexpected attention from social media sites, such as...
- the total number of pageviews (from all sources) that article received
in the same period of time
- the number of pageviews the article received from the same platform
(e.g. Facebook) on the previous day (i.e. two days before the report date)
- the number of editors who have the page on their watchlist
- the number of editors who have watchlisted the page AND recently
visited it
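For anyone who wants to consume the report programmatically, here is a
minimal sketch in Python that fetches the current report's wikitext through
the MediaWiki API. The page title comes from [1]; the User-Agent string is a
placeholder, and the parsing of the table is left to you, since the column
layout may change as the report evolves:

    import requests

    # Fetch the wikitext of the daily report via the MediaWiki API.
    API = "https://en.wikipedia.org/w/api.php"
    params = {
        "action": "parse",
        "page": "User:HostBot/Social_media_traffic_report",
        "prop": "wikitext",
        "format": "json",
        "formatversion": 2,
    }
    resp = requests.get(API, params=params,
                        headers={"User-Agent": "smtr-reader/0.1 (placeholder)"})
    resp.raise_for_status()
    wikitext = resp.json()["parse"]["wikitext"]

    # Each row should contain the fields listed above (platform views,
    # total views, previous-day platform views, watchers, and watchers
    # who recently visited the page).
    print(wikitext[:500])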
We want your feedback! We have some ideas of our own for how to improve the
report, but we want to hear yours! If you have feature suggestions, please
add them here.[4] We intend to maintain this daily report for at least the
next two months. If we receive feedback that the report is useful, we are
considering making it available indefinitely.
If you have other questions about the report, please first check out our
(still growing) FAQ [5]. All questions, comments, concerns, ideas, etc. are
welcome on the project talkpage on Meta.[4]
1. https://en.wikipedia.org/wiki/User:HostBot/Social_media_traffic_report
2. https://www.engadget.com/2018/03/15/wikipedia-unaware-would-be-youtube-fact…
3. https://mashable.com/2017/10/05/facebook-wikipedia-context-articles-news-fe…
4. https://meta.wikimedia.org/wiki/Research_talk:Social_media_traffic_report_p…
5. https://meta.wikimedia.org/wiki/Research:Social_media_traffic_report_pilot/…
Cheers,
Jonathan
--
Jonathan T. Morgan
Senior Design Researcher
Wikimedia Foundation
User:Jmorgan (WMF) <https://meta.wikimedia.org/wiki/User:Jmorgan_(WMF)>
(Uses He/Him)
*Please note that I do not expect a response from you on evenings or
weekends*
Hi everybody,
I recently learned from Superset upstream that the way we configure Druid
datasources is considered "legacy", and support for it will likely decrease
over time. They suggested using Druid's SQL capabilities instead,
configuring Druid as a database and mapping every Druid datasource to a table.
What does this mean in practice? Several things:
1) We should stop creating Druid datasources (Sources -> Druid Datasources)
in favor of Druid tables. I added some info in
https://wikitech.wikimedia.org/wiki/Analytics/Systems/Superset#Druid_dataso….
The nice thing is that this way we won't be required to specify all the
dimensions/attributes by hand (see the sketch after this list).
2) We should test charts with Druid tables and see if we find bugs. These
can come from our Druid version (0.12.3; the latest upstream release is
0.18, and we'll upgrade as soon as possible, but that takes time) or from
Superset itself.
3) Migrate all charts using Druid Datasources to Druid tables (I'll follow
up with upstream to see if there is a way to script this and avoid a manual
fix for every chart).
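As a side note, for anyone who wants to get a feel for the table-based
access outside Superset first, here is a minimal sketch using pydruid's
SQLAlchemy dialect (which is what backs the "Druid tables" setup). The
broker host/port and the datasource name are placeholders, not our actual
configuration:

    from sqlalchemy import create_engine, text

    # pydruid registers the "druid://" dialect; host/port are placeholders.
    engine = create_engine("druid://druid-broker.example.org:8082/druid/v2/sql/")

    # Every Druid datasource is exposed as a table, so plain SQL works and
    # dimensions/attributes don't have to be declared by hand.
    # "my_datasource" is a hypothetical name.
    with engine.connect() as conn:
        result = conn.execute(text('SELECT COUNT(*) AS cnt FROM "my_datasource"'))
        print(result.fetchone())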
The last point is a big request, I know, but in my opinion it is better to
start now rather than later. I have recently started testing Superset
release candidates when upstream publishes them, but I wasn't able to stop
the release of 0.36.0 with a bug report (
https://github.com/apache/incubator-superset/issues/9468), since the
affected code was considered "legacy" and the bug therefore not severe
enough to block a release. As you can imagine, this makes keeping up with
Superset a little more difficult, and I'd prefer to follow best (supported)
practices from upstream where possible :)
There is a task open for this: https://phabricator.wikimedia.org/T249681.
If somebody could help me test Druid tables, that would be really great!
Thanks in advance,
Luca
Hi all,
join us for our monthly Analytics/Research Office hours on 2020-04-29 at
18:00-19:00 UTC. Bring all your research questions and ideas to discuss
projects, data, analysis, etc. To participate, please join the IRC channel
#wikimedia-research [1]. More detailed information can be found here [2] or
on the etherpad [3], where you can add items to the agenda or check notes
from previous meetings.
Note that for this edition the timeslot has changed slightly in order to
avoid conflicts with other events (it has moved one hour later in the day
and one week later in the month than originally announced).
Best,
Martin
[1] irc://chat.freenode.net:6667/wikimedia-research
[2] https://www.mediawiki.org/wiki/Wikimedia_Research/Office_hours
[3] https://etherpad.wikimedia.org/p/Research-Analytics-Office-hours
--
Martin Gerlach
Research Scientist
Wikimedia Foundation
Hi everyone,
We are delighted to announce that Wiki Workshop 2020 will be held in
Taipei on April 20 or 21, 2020 (the exact date to be finalized soon) as
part of the Web Conference 2020 [1]. In past years, Wiki Workshop
has traveled to Oxford, Montreal, Cologne, Perth, Lyon, and San
Francisco.
You can read more about the call for papers and the workshops at
http://wikiworkshop.org/2020/#call. Please note that the deadline for
submissions to be considered for the proceedings is January 17. All
other submissions should be received by February 21.
If you have questions about the workshop, please let us know on this
list or at wikiworkshop@googlegroups.com.
Looking forward to seeing you in Taipei.
Best,
Miriam Redi, Wikimedia Foundation
Bob West, EPFL
Leila Zia, Wikimedia Foundation
[1] https://www2020.thewebconf.org/
Hi everybody,
I recently learned from Superset upstream that the suggested way to query
Druid is via the SQLAlchemy connector, not the Druid one (which is
"deprecated"). I dug a little deeper into this and found the
following:
- Druid Analytics supports SQL querying on the brokers, so it was a matter
of adding a new "database" called "Druid Analytics SQL" to Superset.
- "Druid Analytics SQL" can be used for data querying and exploration in
the SQL Lab (feel free to test it!).
- Using SQLAlchemy should be more quick/convenient since we wouldn't need
to define mapping for Druid datasources manually (need to better
investigate this but I am reasonably sure).
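If you want to double-check the broker-side SQL support mentioned above,
here is a minimal sketch that posts a query to Druid's standard SQL
endpoint (POST /druid/v2/sql/). The broker host/port and the datasource
name are placeholders:

    import requests

    # Druid brokers accept SQL queries as JSON on POST /druid/v2/sql/.
    url = "http://druid-broker.example.org:8082/druid/v2/sql/"
    payload = {"query": 'SELECT COUNT(*) AS cnt FROM "my_datasource"'}

    resp = requests.post(url, json=payload)
    resp.raise_for_status()
    print(resp.json())  # e.g. [{"cnt": 12345}]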
Moving all the charts already in Superset to the new database/datasource is
probably a ton of work, and we are not really in a hurry (the "deprecated"
above means upstream still supports it, but a bug in that code will not be
considered a release blocker). What I'd ask is that you start using it and
see whether there are pros/cons in your daily workflows.
I created a task for reporting thoughts/suggestions/bugs/etc., to avoid
spamming too many people: https://phabricator.wikimedia.org/T249681
Thanks!
Luca