I have written a program, `example.jar`, which uses a Spark context. How can I run this on a cluster which uses Slurm? This is related to https://stackoverflow.com/questions/29308202/running-spark-on-top-of-slurm, but the answers there are not very detailed and not on Server Fault.
In order to run an application using a Spark context, it is first necessary to run a Slurm job which starts a master and some workers. There are some things you will have to watch out for when using Slurm.
I'm working with the Linux binaries installed to `$HOME/spark-1.5.2-bin-hadoop2.6/`. Remember to replace `<username>` and `<shared folder>` with some valid values in the script.
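A minimal sketch of such an sbatch script, assuming Spark's standalone `spark-class` launchers from that tarball and a shared home directory; the job name, node count, resource limits, ports, and the `master_url.txt` path are illustrative assumptions you will need to adapt:

```bash
#!/bin/bash
#SBATCH --job-name=spark-cluster
#SBATCH --nodes=3
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=8
#SBATCH --time=01:00:00

SPARK_HOME="$HOME/spark-1.5.2-bin-hadoop2.6"

# The first node of the allocation becomes the Spark master.
MASTER_HOST=$(scontrol show hostnames "$SLURM_JOB_NODELIST" | head -n 1)
MASTER_URL="spark://${MASTER_HOST}:7077"

# Write the master URL to a shared location so spark-submit can find it later.
echo "$MASTER_URL" > "/home/<username>/<shared folder>/master_url.txt"

# Run the master in the foreground via spark-class so it stays tied to the
# Slurm job step (the sbin/start-*.sh scripts daemonize and detach).
srun --nodes=1 --ntasks=1 --nodelist="$MASTER_HOST" \
    "$SPARK_HOME/bin/spark-class" org.apache.spark.deploy.master.Master \
    --host "$MASTER_HOST" --port 7077 --webui-port 8080 &

sleep 10  # give the master a moment to come up

# One worker per remaining node, limited to the cores Slurm allocated.
srun --nodes=$((SLURM_JOB_NUM_NODES - 1)) --ntasks=$((SLURM_JOB_NUM_NODES - 1)) \
    --exclude="$MASTER_HOST" \
    "$SPARK_HOME/bin/spark-class" org.apache.spark.deploy.worker.Worker \
    --cores "$SLURM_CPUS_PER_TASK" "$MASTER_URL"
```

The last srun blocks, so the allocation (and with it the Spark cluster) stays up until the job's time limit is reached or it is cancelled with scancel.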
Now start the sbatch job and, after that, run `example.jar`:
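A sketch of those two steps, assuming the script above is saved as `spark-cluster.sbatch` (the file names are assumptions) and that `example.jar` names its main class in the manifest:

```bash
# bring up the Spark master and workers inside a Slurm allocation
sbatch spark-cluster.sbatch

# once the job is running, read back the master URL the script wrote
MASTER_URL=$(cat "/home/<username>/<shared folder>/master_url.txt")

# submit the application against that master
# (add --class your.package.Main if the jar's manifest has no main class)
"$HOME/spark-1.5.2-bin-hadoop2.6/bin/spark-submit" \
    --master "$MASTER_URL" \
    example.jar
```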
As maxmlnkn's answer states, you need a mechanism to set up and launch the appropriate Spark daemons in a Slurm allocation before a Spark jar can be executed via spark-submit.
Several scripts/systems have been developed to do this setup for you. The answer you linked above mentions Magpie at https://github.com/LLNL/magpie (full disclosure: I'm the developer/maintainer of those scripts). Magpie provides a job submission file (`submission-scripts/script-sbatch-srun/magpie.sbatch-srun-spark`) for you to edit with your cluster specifics and the job script you want to execute. Once configured, you'd submit it via `sbatch -k ./magpie.sbatch-srun-spark`. See `doc/README.spark` for more details.
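A rough sketch of that workflow, using the paths mentioned above (the clone step is just a standard GitHub checkout):

```bash
# get Magpie and its submission templates
git clone https://github.com/LLNL/magpie.git
cd magpie

# edit submission-scripts/script-sbatch-srun/magpie.sbatch-srun-spark
# to set your cluster specifics and the job script to run, then submit it
# (see doc/README.spark for details):
sbatch -k ./submission-scripts/script-sbatch-srun/magpie.sbatch-srun-spark
```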
I will mention that there are other scripts/systems that do this for you. I lack experience with them, so I can't comment beyond linking them below:
https://github.com/glennklockwood/myhadoop
https://github.com/hpcugent/hanythingondemand