Hi Luca,

well, given that you already have to deal with Hive today, I just wanted to report that the HS2 server has rejected my queries a few times in the previous days, with the number of open connections reported as the most likely reason. I guess some defensive programming in my R scripts will take care of running the queries when the rush is not that dense; however, nothing similar has ever happened in the previous months, so I wanted to let you know.

You know that I'm not in Data Engineering, so I don't have a clue whether this has anything to do with the HS2 settings as they were planned by Analytics-Engineering. Maybe nothing needs to be changed. Just wanted to let you know.

Good luck with the daemon.


Goran S. Milovanović, PhD
Data Scientist, Software Department
Wikimedia Deutschland

"It's not the size of the dog in the fight,
it's the size of the fight in the dog."
- Mark Twain

On Thu, Dec 7, 2017 at 12:36 PM, Luca Toscano <ltoscano@wikimedia.org> wrote:
Hi everybody,

we are experiencing some issues with the Hive daemon, so currently Hive queries are not available. I am going to update this thread as soon as the issue is over.

For more info, please contact me (elukey) on IRC (#wikimedia-analytics).

Sorry for the trouble!


2017-12-06 19:47 GMT+01:00 Luca Toscano <ltoscano@wikimedia.org>:
Hi everybody,

we need to reboot the analytics1003 host for Linux kernel and openjdk updates tomorrow, Dec 07, at 10 AM CET. Hive and Oozie will stop for a (hopefully) brief amount of time, but since they'll need to stop before the reboot, in-flight jobs/queries might fail. We'll try to avoid the reboot if too many jobs are running, but at some point we'll need to pull the trigger.

Please let me know on IRC (#wikimedia-analytics, elukey) or via email if you have any issue with this maintenance.

Thanks and sorry for the trouble!

Luca (on behalf of the Analytics team) 

Engineering mailing list