Hive users on stat1002/1004: you may have seen a deprecation warning when
launching the hive client, saying that it is being replaced by Beeline.
The Beeline shell has always been available, but it required supplying a
database connection string every time, which was pretty annoying. We now
have a wrapper set up to make this easier. The old Hive CLI will continue
to exist, but we encourage moving over to Beeline. To use it, log into the
stat1002/1004 boxes as usual and launch `beeline`.
There is some documentation on this here:
If you run into any issues using this interface, please ping us on the
Analytics list or in #wikimedia-analytics, or file a bug on Phabricator.
(If you are wondering stat1004 whaaat - there should be an announcement
coming up about it soon!)
Curious, what percentage of digital assistants (Alexa, Siri, Cortana,
Google) cite Wikipedia when a person asks a question?
Does the current Wikipedia mobile app support voice search?
Are there any reports on this? Thanks in advance!
Stella Yu | STELLARESULTS | 415 690 7827
"Chronicling heritage brands and legendary people."
We are excited to announce that the 5th annual Wiki Workshop will
take place in Lyon on April 24, 2018, as part of The Web Conference
2018 (a.k.a. WWW2018).
You can access the call for papers at
http://wikiworkshop.org/2018/#call . Please submit your ongoing or
completed research related to Wikimedia projects to the workshop. Note
that 2018-01-28 is the submission deadline if you want your paper to
appear in the proceedings, and 2018-03-11 is the deadline for all other papers.
Following last year's model, the workshop will have a set of
invited talks (Jon Kleinberg and Markus Kroetzsch have already
accepted our invitation \o/), a poster session, and more.
Questions and comments are welcome. Otherwise, we're looking forward
to receiving your submissions and seeing you in Lyon in April. :)
Leila, on behalf of the organizers 
Senior Research Scientist
As many of you know already, this year the Web Conference
<https://www2018.thewebconf.org> will feature an alternate track on
*Journalism, Misinformation, and Fact Checking*, jointly organized by
Kristina Lerman, Takis Metaxas, and me.
We are happy to announce that the final program is up on the website:
Aside from the twelve accepted research presentations, we are particularly
happy to announce that we have assembled an exciting panel. See below for
more information about it.
We hope to see you in Lyon. If you have any questions, feel free to reach
out at misinfochairs(a)www2018.thewebconf.org.
Giovanni, Kristina, and Takis
*The effects of “Fake News” on Journalism and Democracy*
*Online propaganda and misinformation appeared along with the first search
engine in the mid-90’s and it became harder to detect in the last decade
with the development of social media applications. Yet, in the last few
years it spread widely in the form of the so-called “fake news”, falsehoods
online formatted and circulated in such a way that a reader might mistake
them for legitimate news articles. How big of a problem is it, how can
technology and policy help us address it, and what are the implications
for Journalism and Democracy?*
- Daniel Funke <https://www.poynter.org/person/dfunke> (Poynter
- Katherine Maher <https://meta.wikimedia.org/wiki/User:Katherine_(WMF)>
- P. Takis Metaxas <http://cs.wellesley.edu/~pmetaxas/> (Albright
Institute for Global Affairs, Wellesley College) – Moderator
- An Xiao Mina <https://about.me/anxiaostudio> (Credibility Coalition
- Soroush Vosoughi <http://soroush.mit.edu/> (MIT Media Lab)
Giovanni Luca Ciampaglia <glciampagl(a)gmail.com> ∙ Assistant Research
IU Network Science Institute <http://iuni.iu.edu/> ∙ glciampaglia.com
News 🕫 *WWW 2018* ∙ Alternate track on Journalism, Misinformation, and
We’re preparing the March 2018 research newsletter and looking for contributors. Please take a look at https://etherpad.wikimedia.org/p/WRN201803 and add your name next to any paper you are interested in covering. Our target publication date is April 1 UTC. If you can't make this deadline but would like to cover a particular paper in the subsequent issue, leave a note next to the paper's entry in the etherpad. As usual, short notes and one-paragraph reviews are most welcome.
Highlights from this month:
• A Brief History of Human Time: Exploring a database of 'notable people'
• A Comparison of the Historical Entries in Wikipedia and Baidu Baike
• A Hybrid Model for Quality Assessment of Wikipedia Articles
• Becoming an online editor: perceived roles and responsibilities of Wikipedia editors
• Capturing the influence of geopolitical ties from Wikipedia with reduced Google matrix
• Community Detection with Metadata in a Network of Biographies of Western Art Painters
• Generation of Multilingual Wikipedia Summaries from Wikidata for ArticlePlaceholders
• Is Catalonia an Independent Country? Tracking Implicit Biases in Crowdsourced Knowledge Graphs
• Learning to Generate Wikipedia Summaries for Underserved Languages from Wikidata
• Linking ImageNet WordNet Synsets with Wikidata
• Mining Cross-Cultural Differences of Named Entities: A Preliminary Study
• Modeling the Wikipedia to Understand the Dynamics of Long Disputes and Biased Articles
• Neural Wikipedian: Generating Textual Summaries from Knowledge Base Triples
• Semantic labeling for quantitative data using Wikidata
• Sentiments in Wikipedia Articles for Deletion Discussions
• The Pipeline of Online Participation Inequalities: The Case of Wikipedia Editing
• "The rise and decline" in a population of peer production projects
• Towards a Question Answering System over the Semantic Web
• Using big data and network analysis to understand Wikipedia article quality
• Visualizing the Flow of Discourse with a Concept Ontology
Masssly, Tilman Bayer and Dario Taraborelli
http://meta.wikimedia.org/wiki/Research:Newsletter
Thank you for your answer. As a Programme Committee member of WikiIndaba 2018 and as the author of "WikiResearch in Africa: Situation and Challenges", which was also presented in the Research Showcase of WikiIndaba 2018, I was honoured to receive your report. I warmly welcome the Wikimedia Foundation's efforts to strengthen WikiResearch in Africa and would like to contribute in this context. We can discuss that if you wish.
Concerning the role of the Wikimedia Foundation, and as I already said after your presentation, I think the main issue is the lack of connection between LangCom and African language regulatory institutions. Another issue is the difficulty of reaching LangCom: messages from communities to the LangCom mailing list take days to be processed by moderators and published, and contacting LangCom through Phabricator and Meta is also problematic. These issues should be fixed.
Finally, regarding the Wikimania proposal about using Wikidata in medicine, I should let you know that I am Csisc, who posted it on the Wikidata talk page of Wikimania 2018.
From: Wiki-research-l <wiki-research-l-bounces(a)lists.wikimedia.org> on behalf of wiki-research-l-request(a)lists.wikimedia.org <wiki-research-l-request(a)lists.wikimedia.org>
Sent: Friday, 23 March 2018 13:00
To: wiki-research-l(a)lists.wikimedia.org
Subject: Wiki-research-l Digest, Vol 151, Issue 11
1. trip report: Wiki Indaba 2018 [partial] (Leila Zia)
2. Re: trip report: Wiki Indaba 2018 [partial] (Gerard Meijssen)
Date: Thu, 22 Mar 2018 16:41:46 -0700
From: Leila Zia <leila(a)wikimedia.org>
To: Research into Wikimedia content and communities
Subject: [Wiki-research-l] trip report: Wiki Indaba 2018 [partial]
Here is the report of the one session I attended in Wiki Indaba over the
Date: Fri, 23 Mar 2018 08:13:31 +0100
From: Gerard Meijssen <gerard.meijssen(a)gmail.com>
To: Research into Wikimedia content and communities
Subject: Re: [Wiki-research-l] trip report: Wiki Indaba 2018 [partial]
I have read your comments on the Wiki Indaba. Sad to hear that you could
not make it.
As a movement it is not our task to serve the "2000" languages that you
mention. It is our task to serve the languages that we support in our
existing Wikipedias. The difference is significant. When people aim to help
themselves, their culture, their language by investing their efforts in a
Wikipedia, we have a process that recognises this and that leads to the
start of a Wikipedia. Thanks to the Incubator and translatewiki.net, we provide
a native interface in all our languages. There are strong arguments why we
should invest more in other languages, such as the top 25 languages minus
English, and in the remaining languages. The easiest argument is that English is
less than 50% of our traffic.
Where you talk about subjects that people are likely to read, there are
many predictive models possible. The big issue in current approaches is
that they start with what we know from existing projects, particularly the English
Wikipedia. The English Wikipedia is biased, and consequently many subjects
that may be of higher relevance in other languages or cultures will not
be suggested when English Wikipedia and its traffic are the yardstick to
measure by. Often there is more and better information in other Wikipedias.
Arguably thanks to Wikidata it becomes easier to find a more composite view
of the subjects people may be interested in.
Anyway, thank you for reporting on your virtual presence; you made a
difference in this way.
On 23 March 2018 at 00:41, Leila Zia <leila(a)wikimedia.org> wrote:
> Hi all,
> Here is the report of the one session I attended in Wiki Indaba over the
> past weekend:
Hi Research-l folks,
Is there a chart that shows the proportions of new users that register on
Wikimedia in association with individual campaigns like Wiki Loves
Monuments, chapter GLAM activities, education programs, etc?
I am particularly interested in knowing what percentage of new users are
likely to be unaffiliated with identifiable programs, and likely need to
learn how Wikimedia works using exclusively online resources and initially
without individualized help.
If this information is available for a variety of snapshots in time, and
for a variety of individual Wikimedia sites, that would be appreciated. If
this information is available with productivity and attrition information
for each group on each site, that would be even better.
( https://meta.wikimedia.org/wiki/User:Pine )