I am still having a hard time with an error on my Spark job, but the error I was getting from the spark-submit script was that it failed because the assembly jar was not available.
While looking into the issue I read an article that mentioned you can put the assembly jar in HDFS and point to it with an environment variable. This speeds up job submission, since Spark does not have to start by copying the jar to HDFS. The documentation on the Spark site mentions setting an environment variable, but I simply added it to the spark-env.sh configuration file, which does the same thing.
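For reference, the spark-env.sh entry looks roughly like the following; the HDFS path and jar name here are placeholders for illustration, so substitute wherever you actually uploaded your assembly jar:

```shell
# conf/spark-env.sh
# SPARK_JAR tells Spark on YARN where to find the assembly jar,
# so spark-submit can skip uploading it on every submission.
# The HDFS path and file name below are examples only.
export SPARK_JAR=hdfs:///user/spark/share/lib/spark-assembly.jar
```

Uploading the jar once with `hdfs dfs -put` and pointing SPARK_JAR at it saves the copy step on every submit.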
A side effect is that the error from the console is more informative: it shows that a real issue with the app is going on. I no longer get the odd error that the jar is not available, but instead:
Container exited with a non-zero exit code 1
Failing this attempt. Failing the application.
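When YARN reports only a non-zero exit code like this, the real stack trace usually lives in the container logs. Assuming log aggregation is enabled on the cluster, something like the following pulls them down (the application id shown is a placeholder; use the one printed by spark-submit or shown in the ResourceManager UI):

```shell
# Fetch the aggregated container logs for the failed application.
# Replace the application id with your own.
yarn logs -applicationId application_1400000000000_0001
```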