Hi friends!
Spark 1.x is pretty old. We only keep it around because it is a standard part of the Cloudera distribution we use in the analytics Hadoop cluster. The Analytics Engineering team uses Spark 2 for all of our jobs, and you should too!
Spark 2 has been available in our cluster for over a year now. If you don't use it yet, see https://wikitech.wikimedia.org/w/index.php?title=Analytics/Systems/Cluster/S... for more info on how to get started.
We'd like to remove Spark 1 during the week of February 11. Please migrate any Spark 1 jobs to Spark 2 by then (if there are any left!). If this timeline doesn't work for you, just let us know and we'll adjust.
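If you're not sure whether one of your jobs still depends on Spark 1, a quick scan for the old entry points may help. The sketch below is only an illustration: the function name, the pattern list, and the sample job are all hypothetical. It relies on the fact that Spark 2 introduced SparkSession as the replacement for SQLContext and HiveContext. A match is not proof that a job runs on Spark 1 (those classes still exist in Spark 2 for backward compatibility), just a hint to look closer.

```python
import re

# Hypothetical pattern list: Spark 1-era entry points that Spark 2
# supersedes with SparkSession. Extend or trim it for your own jobs.
SPARK1_PATTERNS = {
    "SQLContext": re.compile(r"\bSQLContext\b"),
    "HiveContext": re.compile(r"\bHiveContext\b"),
}


def find_spark1_usages(source):
    """Return (line_number, pattern_name) pairs for each match in source."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SPARK1_PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, name))
    return hits


# Made-up job source, used only to demonstrate the scan.
example_job = """\
from pyspark import SparkContext
from pyspark.sql import HiveContext
sc = SparkContext()
hive = HiveContext(sc)
"""

for lineno, name in find_spark1_usages(example_job):
    print(f"line {lineno}: uses {name}")
```

To check a real job, read its source file into a string and pass that to `find_spark1_usages` instead of `example_job`.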
Thanks!
- Andrew Otto & Analytics Engineering
Hi again!
Spark 1 has now been removed from the cluster. Documentation has been updated at https://wikitech.wikimedia.org/wiki/Analytics/Systems/Cluster/Spark.
(In reply to the announcement above, sent by Andrew Otto <otto@wikimedia.org> on Tue, Jan 22, 2019 at 10:17 AM.)