For all Hive users on stat1002/1004: you might have seen a deprecation
warning when you launch the hive client, saying that it is being replaced
with Beeline. The Beeline shell has always been available, but it
required supplying a database connection string every time, which was
pretty annoying. We now have a wrapper
set up to make this easier. The old Hive CLI will continue to exist, but we
encourage moving over to Beeline. You can use it by logging into the
stat1002/1004 boxes as usual and launching `beeline`.
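
For anyone curious what "supplying a connection string every time" looks like in practice, here is a minimal Python sketch that builds a direct Beeline invocation. The host name `hive-server.example.org`, the port 10000 (the usual HiveServer2 default), the `default` database and the helper name `beeline_command` are assumptions for illustration only, not the actual stat1002/1004 configuration. `-u`, `-n` and `-e` are standard Beeline options.

```python
import getpass
import shlex


def beeline_command(host, port=10000, database="default", user=None, query=None):
    """Build the argument list for a direct (unwrapped) Beeline call.

    host, port and database are hypothetical placeholders; substitute
    whatever your HiveServer2 instance actually uses.
    """
    # HiveServer2 JDBC URLs have the form jdbc:hive2://<host>:<port>/<database>
    jdbc_url = f"jdbc:hive2://{host}:{port}/{database}"
    # -u gives the JDBC URL, -n the user name to connect as
    argv = ["beeline", "-u", jdbc_url, "-n", user or getpass.getuser()]
    if query is not None:
        # -e runs a single query and exits instead of opening the shell
        argv += ["-e", query]
    return argv


if __name__ == "__main__":
    cmd = beeline_command("hive-server.example.org", query="SHOW DATABASES;")
    # Print a shell-safe version of the command rather than executing it
    print(shlex.join(cmd))
```

The sketch only prints the command; it does not connect to anything. Presumably the wrapper fills in these connection details for you, so that plain `beeline` is enough.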
There is some documentation on this here:
If you run into any issues using this interface, please ping us on the
Analytics list or #wikimedia-analytics, or file a bug on Phabricator.
(If you are wondering stat1004 whaaat - there should be an announcement
coming up about it soon!)
The Analytics team would like to announce that we have migrated the
reportcard to a new domain:
The migrated reportcard includes both legacy and current pageview data,
daily unique devices and new editors data. Pageview and devices data is
updated daily but editor data is still updated ad-hoc.
The team is currently revamping the way we compute edit data,
and we hope to be able to provide monthly updates for the main edit metrics
this quarter. Some of those will be visible in the reportcard, but the new
wikistats will have more detailed reports.
You can follow the new wikistats project here:
Just wanted to let you know that we have stopped the Eventlogging MySQL
Kafka consumers on eventlog1001 for
https://phabricator.wikimedia.org/T183123. They will be re-enabled as soon
As outlined in https://phabricator.wikimedia.org/T181518, the Analytics team
needs to repurpose the notebook1002 host (one of the PAWS/Jupyter nodes) as a
Kafka Analytics broker for an urgent maintenance procedure. We are not aware
of anybody actively using it (as it happens with notebook1001), but to be on
the safe side all the home directories will be saved on notebook1001's /srv
directory in case somebody needs that data.
We are in the process of ordering new hardware to replace the current
notebook1001 and 1002 hosts, so the absence of notebook1002 will be only
Luca (on behalf of the Analytics team)
The next Research Showcase will be live-streamed this Wednesday, December
13, 2017 at 11:15 AM (PST) 18:15 UTC.
YouTube stream: https://www.youtube.com/watch?v=OoVwus1Owtk
As usual, you can join the conversation on IRC at #wikimedia-research. And,
you can watch our past research showcases here.
This month's presentation:
*The State of the Article Expansion Recommendation System*
By Leila Zia
Only 1% of English Wikipedia articles are labeled with quality class Good
or better, and 37% of the articles are stubs. We are building an article
expansion recommendation system to change this in Wikipedia, across many
languages. In this presentation, I will talk with you about our current
thinking on the vision and direction of the research that can help us build
such a recommendation system, and share more about one specific area of
research we have heavily focused on in the past months: building a
recommendation system that can help editors identify what sections to add
to an already existing article. I will present some of the challenges we faced,
the methods we devised or used to overcome them, and the results of the
first line of experiments on the quality of such recommendations (teaser:
the results are really promising. The precision and recall at 10 are 80%.)
Project Assistant, Engineering Admin
We need to reboot the analytics1003 host for Linux kernel and OpenJDK
updates tomorrow, Dec 07, at 10 AM CET. Hive and Oozie will stop for a
(hopefully) brief amount of time, but since they'll need to stop before the
reboot, in-flight jobs/queries might fail. We'll try to avoid
the reboot if too many jobs are running, but at some point we'll need to
pull the trigger.
Please let me know on IRC (#wikimedia-analytics, elukey) or via email if
you have any issue with this maintenance.
Thanks and sorry for the trouble!
Luca (on behalf of the Analytics team)