More informative errors for Spark jobs

I am still having a hard time with an error on my Spark job, but the error I was getting from the spark-submit script was that it failed because the assembly jar was not available.

While looking into my issue, I read an article that mentioned you can put the assembly jar in HDFS and point to it with an environment variable. This speeds up job submission, since Spark does not have to start by copying the jar to HDFS. The documentation on the Spark site mentions setting an environment variable, but I simply added it to the spark-env.sh configuration file, which does the same thing.
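For reference, here is a sketch of that setup, assuming a Spark 1.x on YARN install where SPARK_JAR is still honored; the jar name and HDFS path are just examples:

# Upload the assembly jar to HDFS once (jar name and path are illustrative)
hadoop fs -mkdir -p /user/spark/share/lib
hadoop fs -put $SPARK_HOME/lib/spark-assembly-1.2.0-hadoop2.4.0.jar /user/spark/share/lib/

# Then in conf/spark-env.sh, point Spark at the jar already sitting in HDFS
export SPARK_JAR=hdfs:///user/spark/share/lib/spark-assembly-1.2.0-hadoop2.4.0.jar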

A side effect is that the error from the console is now informative enough to show that a real issue with the app is going on. Instead of the odd error that the jar is not available, I get:

Container exited with a non-zero exit code 1
.Failing this attempt.. Failing the application.
appMasterHost: N/A
appQueue: default
appMasterRpcPort: -1
appStartTime: 1421956977050
yarnAppState: FAILED
distributedFinalState: FAILED
appTrackingUrl: http://hostname:8088/cluster/app/application_1421950771320_0008
appUser: hadoop
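From there, the container logs are the next place to look. Assuming log aggregation is enabled on the cluster, the YARN CLI can pull them with the application id shown in the tracking URL above:

# Fetch the aggregated container logs for the failed application
yarn logs -applicationId application_1421950771320_0008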
