Hi all!
I just upgraded the spark2 package across the cluster to
Spark 2.3.0. If you are using the pyspark2, spark2-*, etc. executables, you are now running Spark 2.3.0.
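If you want to double-check which version an executable launches, the standard --version flag works on the submit wrapper (spark2-submit here is the CDH-style wrapper implied by the spark2-* names above):

    $ spark2-submit --version
    # should now report Spark 2.3.0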
We are moving towards making Spark 2 the default for all Analytics production jobs. We don’t have a deprecation plan for Spark 1 yet, so you should be able to continue using Spark 1 for the time being.

However, in order to support large Spark 2 jobs on YARN, we need to upgrade the default YARN Spark shuffle service to the Spark 2 version. This means that large Spark 1 jobs that rely on the shuffle service may no longer work properly. We don’t know of any large productionized Spark 1 jobs other than the ones the Analytics team manages, but if you have any that you are worried about, please let us know ASAP.
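If you’re not sure whether one of your jobs qualifies, a quick tell is whether its submit command enables the external shuffle service (usually alongside dynamic allocation). The property names below are Spark’s standard ones; the script name is just a placeholder:

    $ spark-submit \
        --master yarn \
        --conf spark.shuffle.service.enabled=true \
        --conf spark.dynamicAllocation.enabled=true \
        my_spark1_job.py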
-Andrew & Analytics Engineering