Hi all,
To all Hive users on stat1002/1004: you might have seen a deprecation
warning when you launch the hive client, claiming that it's being replaced
with Beeline. The Beeline shell has always been available to use, but it
required supplying a database connection string every time, which was
pretty annoying. We now have a wrapper script
<https://github.com/wikimedia/operations-puppet/blob/production/modules/role…>
set up to make this easier. The old Hive CLI will continue to exist, but we
encourage moving over to Beeline. You can use it by logging into the
stat1002/1004 boxes as usual, and launching `beeline`.
There is some documentation on this here:
https://wikitech.wikimedia.org/wiki/Analytics/Cluster/Beeline.
If you run into any issues using this interface, please ping us on the
Analytics list or #wikimedia-analytics or file a bug on Phabricator
<http://phabricator.wikimedia.org/tag/analytics>.
(If you are wondering stat1004 whaaat - there should be an announcement
coming up about it soon!)
Best,
--Madhu :)
Hoi,
In the thirties and forties of the 20th Century the NSB was the Dutch Nazi
organisation. It produced a lot of propaganda material. This material is
now available on Commons and it is available for study and illustration.
Thanks,
GerardM
http://td35.tripolis.com/public/r/CIeGfz2qlOjS1oj4gCOU3w/V_GjVcXxLN9UrKrHOH…
Call for Contributions
work-in-progress - posters - demos - tutorials - doctoral programme
11th Conference on Intelligent Computer Mathematics
- CICM 2018 -
August 13-17, 2018
RISC, Hagenberg, Austria
http://www.cicm-conference.org/2018
--------------------------------------------------------------------------------
CICM focuses on theoretical and practical solutions for mathematical
applications such as computation, deduction, knowledge management, libraries,
and user interfaces.
CICM 2018 will feature 3 invited speakers:
* Akiko Aizawa, National Institute of Informatics, University of Tokyo
* Bruno Buchberger, Research Institute for Symbolic Computation, Johannes Kepler
University
* Adri Olde Daalhuis, University of Edinburgh
and 6 affiliated workshops:
* Computer Algebra in the age of Types
* Computer Mathematics in Education - Enlightenment or Incantation
* Formal Mathematics for Mathematicians
* Formal Verification of Physical Systems
* Mathematical Models and Mathematical Software as Research Data
* OpenMath Workshop
In addition to the above and the formally reviewed program, CICM invites
contributions of various forms:
1) Work-in-progress papers (any length up to 15 pages) will be lightly reviewed
and, if accepted, presented at the conference and published in a volume of
CEUR-WS.
2) Demos and posters (submitted as a summary of a few paragraphs) will be
presented during a dedicated session.
Submission is open to any topic interesting to the CICM audience. We also
encourage authors of accepted papers to supplement their talk with a demo or
poster.
3) Tutorials (up to 60 minutes, submitted as a summary of a few paragraphs) will
take place in individual rooms during a dedicated session.
4) Doctoral program submissions (2 page abstract + CV) will be presented and
discussed during a dedicated session focusing on mentoring.
Submission is open to any doctoral student in the CICM area. Student authors
of accepted papers are strongly encouraged to additionally submit to the
doctoral program.
Financial support for doctoral program participants is available.
All submissions should be made via EasyChair at
https://easychair.org/conferences/?conf=cicm2018
Notifications are sent on a rolling basis.
The last day to submit is July 15.
Hey everyone,
we're hosting a dedicated session in June on our joint work with Cornell
and Jigsaw on predicting conversational failure
<https://arxiv.org/abs/1805.05345> on Wikipedia talk pages. This is part of
our contribution to WMF's Anti-Harassment program.
The showcase
<https://www.mediawiki.org/wiki/Wikimedia_Research/Showcase#June_2018> will be
live-streamed <https://www.youtube.com/watch?v=m4vzI0k4OSg> on *Monday,
June 18, 2018* at 11:30 AM (PDT), 18:30 (UTC). (Please note this falls on
a Monday this month).
Conversations Gone Awry. Detecting Early Signs of Conversational Failure
By *Justine Zhang and Jonathan Chang, Cornell University*
One of the main challenges
online social systems face is the prevalence of antisocial behavior, such
as harassment and personal attacks. In this work, we introduce the task of
predicting from the very start of a conversation whether it will get out of
hand. As opposed to detecting undesirable behavior after the fact, this
task aims to enable early, actionable prediction at a time when the
conversation might still be salvaged. To this end, we develop a framework
for capturing pragmatic devices—such as politeness strategies and
rhetorical prompts—used to start a conversation, and analyze their relation
to its future trajectory. Applying this framework in a controlled setting,
we demonstrate the feasibility of detecting early warning signs of
antisocial behavior in online discussions.
Building a rich conversation corpus from Wikipedia Talk pages
We present a
corpus of conversations that encompasses the complete history of
interactions between contributors to English Wikipedia's Talk Pages. This
captures a new view of these interactions by containing not only the final
form of each conversation but also detailed information on all the actions
that led to it: new comments, as well as modifications, deletions and
restorations. This level of detail supports new research questions
pertaining to the process (and challenges) of large-scale online
collaboration. As an example, we present a small study of removed comments
highlighting that contributors successfully take action on more toxic
behavior than was previously estimated.
YouTube stream: https://www.youtube.com/watch?v=m4vzI0k4OSg
As usual, you can join the conversation on IRC at #wikimedia-research, and
you can watch our past research showcases here
<https://www.youtube.com/playlist?list=PLhV3K_DS5YfLQLgwU3oDFiGaU3K7pUVoW>.
Hope to see you there on June 18!
Dario
Hi Everyone,
I've been performing some analysis on the "thank a user for an edit"
feature which was introduced in 2013, but have run into a data availability
hurdle. I'm able to easily retrieve all the "thanks" that happened using
the database replicas by searching the logging table with "log_type =
'thanks'" criteria. However, these entries only show who thanked whom, and
when. I don't see any record of which *revision* was being thanked. Does anyone
know where I might find this data?
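
For reference, here is a rough sketch of the kind of query I mean. It runs
against a toy in-memory SQLite table rather than the real replicas, and the
column subset (log_timestamp, log_user_text, log_title) is an assumption
made for illustration, as are the helper names make_toy_db and fetch_thanks.
The point is only that each row carries a thanker, a recipient and a
timestamp, but nothing identifying the revision.

```python
import sqlite3


def make_toy_db():
    """Build an in-memory stand-in for the logging table (assumed column subset)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE logging ("
        "log_id INTEGER PRIMARY KEY, log_type TEXT, log_timestamp TEXT, "
        "log_user_text TEXT, log_title TEXT)"
    )
    conn.executemany(
        "INSERT INTO logging (log_type, log_timestamp, log_user_text, log_title) "
        "VALUES (?, ?, ?, ?)",
        [
            ("thanks", "20130601120000", "Alice", "Bob"),
            ("delete", "20130602090000", "Carol", "Some_page"),
            ("thanks", "20130603150000", "Bob", "Alice"),
        ],
    )
    conn.commit()
    return conn


def fetch_thanks(conn):
    """Return (timestamp, thanker, recipient) for every 'thanks' log row."""
    cur = conn.execute(
        "SELECT log_timestamp, log_user_text, log_title "
        "FROM logging WHERE log_type = 'thanks' ORDER BY log_timestamp"
    )
    return cur.fetchall()


if __name__ == "__main__":
    for timestamp, thanker, recipient in fetch_thanks(make_toy_db()):
        print(timestamp, thanker, "thanked", recipient)
```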
Make a great day,
Max Klein ‽ http://notconfusing.com/
Alright, we are now ready to do this. New date: June 14 2018, around
14:00 UTC.
On Tue, Jun 5, 2018 at 9:31 AM, Andrew Otto <otto(a)wikimedia.org> wrote:
> Hi,
>
> We need to delay this switchover. We’ve made some changes to the plan
> that require a bit more prep work. Follow
> https://phabricator.wikimedia.org/T185225 for more details.
>
> I’ll reply with another announcement email when we set a new date.
>
> - Andrew Otto
> Senior Systems Engineer, WMF
>
>
>
> On Tue, May 15, 2018 at 11:43 AM, Andrew Otto <otto(a)wikimedia.org> wrote:
>
>> Hi all!
>>
>> *If you are not an active user of the EventStreams service, you can
>> ignore this email.*
>>
>> We’re in the process of upgrading
>> <https://phabricator.wikimedia.org/T152015> the backend infrastructure
>> that powers the EventStreams service. When we switch EventStreams to
>> the new infrastructure <https://phabricator.wikimedia.org/T185225>, the
>> ‘offsets’ AKA Last-Event-IDs will change.
>>
>> Connected EventStreams SSE clients will reconnect and not be able to
>> automatically consume from the exact position in the stream where they left
>> off. Instead, reconnecting clients will begin consuming from the latest
>> messages in the stream. This means that connected clients will likely miss
>> any messages that occurred during the reconnect period. Hopefully this
>> will be a very small number of messages, as your SSE client should
>> reconnect quickly.
>>
>> This switch is scheduled to happen on June 5 2018, at around 17:30 UTC.
>>
>> Let us know if you have any questions.
>>
>> Thanks!
>> - Andrew Otto
>> Senior Systems Engineer, WMF
>>
>
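
For anyone unsure how their client tracks its position: below is a minimal
sketch of SSE parsing in Python, assuming the stream arrives as decoded text
lines. The function names parse_sse and consume are made up for this
example and are not part of any EventStreams client library. The last `id:`
value seen is what a client would normally send back as Last-Event-ID on
reconnect; per the announcement quoted above, IDs saved before the
switchover will not let a client resume from the exact same position
afterwards.

```python
def parse_sse(lines):
    """Yield one dict per server-sent event from an iterable of text lines."""
    event_id = None  # the last seen id persists across events
    data = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if data:
                yield {"id": event_id, "data": "\n".join(data)}
            data = []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space is stripped
        if field == "id":
            event_id = value
        elif field == "data":
            data.append(value)


def consume(lines, handle):
    """Pass each event's data to `handle`; return the last event id seen."""
    last_id = None
    for event in parse_sse(lines):
        handle(event["data"])
        last_id = event["id"]
    return last_id


if __name__ == "__main__":
    sample = ["id: 42\n", "data: hello\n", "\n"]
    print(consume(sample, print))
```

A client that persists the value returned by consume() can simply discard
it around the switchover and start from the latest messages.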
Hello,
Lucas (cc'd) and I have been working on tools and corpora that transform wiki
revision dumps into structured conversations; as part of this, we want to
make sure any down-stream services and corpora that we develop respect the
deleted (and suppressed) revisions; namely that we remove any copies we have
of things deleted on Wikipedia.
For that we need a way to:
1. get all revision IDs that were deleted or suppressed (or all non-deleted
and non-suppressed ones)
2. have a way to get new deletions or suppressions so that we can remove any
copies that we have.
What's the right infrastructure/APIs to use for this?
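
Whatever the source of the IDs turns out to be, the removal step on our side
would look roughly like the sketch below. It assumes, purely for
illustration, that our copies are keyed by revision ID in a dict; the name
purge_deleted is hypothetical, and obtaining the set of deleted or
suppressed IDs is exactly the part we are asking about.

```python
def purge_deleted(corpus, deleted_rev_ids):
    """Return a copy of `corpus` without the deleted or suppressed revisions.

    `corpus` maps revision ID -> stored text (an assumed layout).
    """
    deleted = set(deleted_rev_ids)
    return {
        rev_id: text
        for rev_id, text in corpus.items()
        if rev_id not in deleted
    }


if __name__ == "__main__":
    corpus = {101: "first comment", 102: "reply", 103: "another reply"}
    print(purge_deleted(corpus, [102]))
```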
Thanks!
Hi all,
We just added a new formal collaboration with Ramtin Yazdanian from
EPFL to develop, design and test models that can help us learn about
New Editor Interests. While the applications of these models are
numerous, we expect to use them at least in the line of research in
addressing Wikipedia contributor diversity gaps.
This research aims to address the cold start problem in Wikimedia
projects: when a user enters the system, you have almost no
information about them, and yet you want to engage with them in
the areas they're interested in. Please see project details at
https://meta.wikimedia.org/wiki/Research:Voice_and_exit_in_a_voluntary_work…
Best,
Leila
--
Leila Zia
Senior Research Scientist, Lead
Wikimedia Foundation