Whoa! I just got the same stopped SparkContext error on a query even after restarting the notebook, this time without a preceding Java heap space error. That seems very strange to me.

On Wed, 5 Feb 2020 at 16:09, Neil Shah-Quinn <nshahquinn@wikimedia.org> wrote:
Hey there!

I was running SQL queries via PySpark (using the wmfdata package) on SWAP when one of my queries failed with "java.lang.OutOfMemoryError: Java heap space".
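
For reference, the call that died looked roughly like this (the real query was a much larger Hive SQL statement; this one is just a stand-in):

    from wmfdata import hive

    # Stand-in for the real query, which scanned a large Hive table
    result = hive.run("SELECT * FROM some_table LIMIT 10")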

After that, when I tried to call the spark.sql function again (via wmfdata.hive.run), it failed with "java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext."

When I tried to create a new Spark session using SparkSession.builder.getOrCreate (whether via wmfdata.spark.get_session or directly), it returned a SparkSession object without complaint, but calling that object's sql function still gave the same "stopped SparkContext" error.
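
In case it helps, the retry went roughly like this (reconstructed from memory, not exact code):

    from pyspark.sql import SparkSession

    # getOrCreate() handed back a session without complaint...
    spark = SparkSession.builder.getOrCreate()

    # ...but using it still raised
    # java.lang.IllegalStateException: Cannot call methods on a stopped SparkContext
    spark.sql("SELECT 1").show()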

Any idea what's going on? I assume restarting the notebook kernel would take care of the problem, but it seems like there has to be a better way to recover.
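
For example, I wondered whether explicitly stopping the dead session before rebuilding it would force a genuinely new SparkContext, something like this (an untested guess on my part):

    from pyspark.sql import SparkSession

    # Guess: stop() might clear the cached session, so that the next
    # getOrCreate() builds a fresh SparkContext instead of returning
    # the stopped one
    SparkSession.builder.getOrCreate().stop()
    spark = SparkSession.builder.getOrCreate()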

Thank you!