Hi everybody,
some news from the Analytics team:
- The Kerberos ticket expiry time has been bumped to 48h. You can
kdestroy/kinit to get the new settings :)
- There are new memory and CPU limits on all stat/notebook hosts, which
should automatically kill big jobs that cause too much memory pressure. CPU
usage is also limited to 90% of the available cores to leave room for
system daemons. This should help a lot in avoiding recurrent alarms to the
SRE team (and me reaching out to some of you as a consequence!) and it
should be a fairer system for everybody. In order to apply these new
settings I'd need to stop and restart all the notebooks running on
notebook1003/1004, but I didn't do it since I didn't want to impact any
work. If you could please take care of stopping/starting your notebooks it
would be really appreciated :)
- We deployed jupyterhub on stat1004 and stat1006, ready for general use!
This should help in avoiding the small home size problem that many of you
are experiencing on notebook1003/1004. We are also working on setting up
jupyterhub on stat1005, with updated dependencies (jupyterhub 1.1.0, toree
0.3.0, etc.; full list in
The plan is to eventually have the same version on all stat boxes (no
timeline yet). We didn't deploy jupyterhub on stat1007 due to some puppet
code refactoring in progress, but we hope to do it next quarter.
- A new stat host (stat1008) will be ready for general use soon. Like
stat1005, it is equipped with a GPU.
If you have questions, doubts, etc., please feel free to follow up with me
or any member of the Analytics team on #wikimedia-analytics :)
Luca (on behalf of the Analytics team)
Hi everybody,
as part of https://phabricator.wikimedia.org/T246578 we'd like to enforce
some basic permissions via puppet on the home directories of all
analytics-privatedata-users members on the analytics clients
(stat/notebooks): ownership $user:analytics-privatedata-users and mode 750.
For example, let's pick my home,
- its ownership will be set to elukey:analytics-privatedata-users
(owner:group)
- its mode will be set to 750
I am talking about the home directory only, not its content (so the
permissions will not be applied recursively). In this way we'd like to
protect PII data that people might copy from Hadoop to the local file
system, allowing only users in analytics-privatedata-users to read
each other's home dirs.
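
If you want to check what mode 750 means in practice, here is a minimal
Python sketch (not the puppet code, just an illustration). The function
names and the temporary directory standing in for a home are made up for
the example; it only looks at the mode bits and does not touch ownership,
since changing the group usually needs elevated privileges.

```python
import os
import stat
import tempfile

# 750 = rwx for the owner, r-x for the group, nothing for everybody else.
EXPECTED_MODE = 0o750


def permission_bits(path):
    """Return only the rwx permission bits of path (e.g. 0o750)."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o777


def is_home_protected(path):
    """True if the directory itself has exactly mode 750."""
    return permission_bits(path) == EXPECTED_MODE


def demo():
    # Hypothetical stand-in for a home directory, created in a temp location.
    home = tempfile.mkdtemp()
    inner = os.path.join(home, "notes.txt")
    with open(inner, "w") as handle:
        handle.write("example\n")
    os.chmod(inner, 0o644)

    # Non-recursive change: only the top-level directory is modified.
    os.chmod(home, EXPECTED_MODE)

    print(stat.filemode(os.stat(home).st_mode))   # symbolic form of the dir mode
    print(oct(permission_bits(inner)))            # the file keeps its own mode
    print(is_home_protected(home))


if __name__ == "__main__":
    demo()
```

Since "others" have no execute bit on the directory, users outside the
group cannot traverse into it, regardless of the permissions of the files
it contains.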
If for any reason this change impacts your work, please let us know in the
aforementioned task. In theory this should not affect anybody, and it
should keep our data a little bit safer :)
Luca (on behalf of the Analytics team)