Hello!
tl;dr: We'd like to turn off Jupyter+Virtualenv (SWAP) in favor of
Jupyter+Conda (Newpyter) the week of May 3rd. Please help us test and
switch before then.
Over the last year, we've slowly been working on replacing the
current virtualenv-based JupyterHub system (formerly known as SWAP) with a
new one (AKA Newpyter) based on Conda <https://docs.conda.io/en/latest/>.
Everything should be in place to switch and decommission the
virtualenv-based system you are all used to. Before we do, we need to make
sure you are all using the new setup and are OK with it!
We'd like to decommission Jupyter+Virtualenv (running on port 8000) the
week of May 3rd. In the meantime, please switch to Jupyter+Conda on port
8880. The documentation has been updated:
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter>
Summary of the changes:
- You will ssh tunnel to port 8880
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter#Access>
instead of port 8000.
- Your Notebook files will remain unchanged.
- Your local data files will remain unchanged.
- Your Python environment will change, so you may need to re-install
packages. See docs here
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter#Conda_environ…>
and here
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda#Installing_p…>.
- The PySpark, Scala Spark, Spark SQL, and Spark-R kernels will be
removed. If you currently use the PySpark kernels, please port your
notebooks to a regular Python kernel, using wmfdata-python to launch your
SparkSession. Docs here
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Jupyter#PySpark>.
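For the first item in the summary, here is a rough sketch of what the new tunnel command could look like. It only builds and prints the command, it does not run it. The host name stat-host.example is a placeholder (use whichever analytics client you normally connect to), and the tunnel_command helper is made up for this illustration. The Access docs linked above remain the authoritative instructions.

```python
import shlex


def tunnel_command(host, port=8880):
    """Build the ssh argv that forwards local `port` to the same port on `host`.

    -N tells ssh not to run a remote command; -L sets up the local port forward.
    """
    return ["ssh", "-N", host, "-L", f"{port}:127.0.0.1:{port}"]


# Placeholder host name, not a real machine: substitute your usual analytics client.
cmd = tunnel_command("stat-host.example")

# Print the command so it can be reviewed and pasted into a terminal.
print(shlex.join(cmd))
```

If you previously tunneled to port 8000, the only part of the command that should need to change is the port number. Once the tunnel is up, JupyterHub should be reachable in your browser on localhost at port 8880.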
Please reach out with any questions, and report issues on this ticket
<https://phabricator.wikimedia.org/T224658>. If we encounter any blockers
along the way, we will postpone the May 3rd deadline.
Thank you!
- Andrew Otto + Data Engineering
Hello!
Today I deployed a change that upgraded the anaconda-wmf
<https://wikitech.wikimedia.org/wiki/Analytics/Systems/Anaconda> release
across the analytics cluster. For most of you, this will be a no-op, as it
mostly improves the way conda environments are created and some of the
Spark integration in Jupyter notebooks.
This does update some dependencies in the anaconda-wmf base conda
environment, so if you see any weirdness with your existing conda
environments, let me know on https://phabricator.wikimedia.org/T224658.
Thanks! Stay tuned for more announcements about JupyterHub. :)
-Andrew Otto
SRE, Data Engineering