Hi all!
I’d like to announce that we’ve done a bit of work to make Jupyter
Notebooks in SWAP <https://wikitech.wikimedia.org/wiki/SWAP> support Spark
kernels. This means that you can now run Spark shells in either local mode
(on the notebook server) or YARN mode (distributed across the Hadoop
cluster) from inside a Jupyter notebook. You can then take advantage of
fancy Jupyter plotting libraries to make graphs directly from data in Spark.
See https://wikitech.wikimedia.org/wiki/SWAP#Spark for documentation.
This is a new feature, and I’m sure there will be kinks to work out. If
you encounter issues or have questions, please respond on this Phabricator
ticket <https://phabricator.wikimedia.org/T190443>, or create a new one and
add the Analytics tag.
Enjoy!
-Andrew Otto & Analytics Engineering