Thanks, Luca!
If you use wmfdata-python <https://github.com/wikimedia/wmfdata-python> or
wmfdata-r <https://github.com/wikimedia/wmfdata-r>, they should *not* be
affected by the changed Hive endpoint as they pick it up from the shell
environment. However, if you notice anything breaking on Monday, please
contact my team at product-analytics(a)wikimedia.org.
On Thu, 17 Dec 2020 at 14:31, Luca Toscano <ltoscano(a)wikimedia.org> wrote:
Hi everybody,
On Monday 21st we'd like to reboot all stat100x hosts for Linux kernel
upgrades at around 9 AM CET. This means that all the notebooks and various
activities running on those nodes will be stopped for a brief amount of
time. To repay your patience, two things will be added:
- A shared Kerberos credential cache with notebooks. In practice, this
means that you will only be required to kinit once (either after SSHing
to stat100x or in a Jupyter notebook), and the credentials will be shared
(no more double kinit, etc.). It is already "live" on stat1004 if you want
to test it! Since the new shared credential cache will have a new location
on disk, all Kerberos sessions will be destroyed and you'll have to kinit
again when the reboots are completed. More details in
https://phabricator.wikimedia.org/T255262.
- A new endpoint for Hive called 'analytics-hive.eqiad.wmnet', which should
replace Hive JDBC/metastore configs that hardcode an-coord1001.eqiad.wmnet
(and allow us to fail over transparently if needed without requesting job
restarts, etc.). The side effect of this is that all Hive-related tools
will change configs (transparently for external users). If you have any
script that points directly to Hive via JDBC (for example, a Python script
using PyHive), please update it to use the new endpoint.
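
For anyone updating a script by hand, here is a minimal sketch of what the
change might look like in Python. The two hostnames come from this
announcement; the helper names, the port (10000), and the Kerberos settings
are assumptions for illustration and may not match your setup.

```python
# Hostnames taken from the announcement above.
OLD_HIVE_HOST = "an-coord1001.eqiad.wmnet"
NEW_HIVE_HOST = "analytics-hive.eqiad.wmnet"


def update_hive_host(text: str) -> str:
    """Replace the hardcoded coordinator hostname with the new endpoint.

    Hypothetical helper: works on any string, e.g. a JDBC URL or the
    contents of a config file.
    """
    return text.replace(OLD_HIVE_HOST, NEW_HIVE_HOST)


def connect_to_hive(host: str = NEW_HIVE_HOST, port: int = 10000):
    """Open a PyHive connection to the new endpoint.

    Assumptions: HiveServer2 listens on port 10000 and uses Kerberos
    authentication with the service name "hive". Run kinit first.
    """
    # Imported lazily so update_hive_host() works without PyHive installed.
    from pyhive import hive

    return hive.Connection(
        host=host,
        port=port,
        auth="KERBEROS",
        kerberos_service_name="hive",
    )
```

For example, update_hive_host("jdbc:hive2://an-coord1001.eqiad.wmnet:10000/default")
returns the same URL pointing at analytics-hive.eqiad.wmnet.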
If this schedule impacts your work, please ping me via email, IRC, etc., and
I'll try to reschedule accordingly :)
Thanks!
Luca (on behalf of Analytics / Data Engineering)
--
Analytics-announce mailing list
Analytics-announce(a)lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/analytics-announce
--
Neil Shah-Quinn
senior data scientist, Product Analytics
<https://www.mediawiki.org/wiki/Product_Analytics>
Wikimedia Foundation <https://wikimediafoundation.org/>