Parallel / MPI Job
Running a simple MPI test job
This job is very simple: it prints a hello world message from every core it runs on.
Sample job submission scripts are available in /opt/examples/slurm.
In this example we copy one of them to our home directory and submit it.
$ cp /opt/examples/slurm/mpi-job.sh ~/mpi-job.sh
New sbatch options for parallel jobs
|Option|Default|Description|
|--ntasks|1|The number of cores to run the job on. The scheduler picks this many cores from a set of machines; it does not guarantee placement of the tasks on specific nodes.|
|--ntasks-per-node|none|The number of tasks per node (use in combination with --nodes).|
|--nodes|none|The number of nodes. For example, --nodes=2 --ntasks-per-node=10 would run on 20 cores total, 10 on each node.|
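The relationship between the node and per-node task counts can be sketched in a few lines of shell. This is only an illustration using the example values above (2 nodes, 10 tasks per node); the variable names are made up for this sketch and are not set by the scheduler.

```shell
#!/bin/bash
# Illustrative values from the --nodes=2 --ntasks-per-node=10 example.
# These variable names are hypothetical, not Slurm environment variables.
nodes=2
ntasks_per_node=10

# Total cores (MPI ranks) = number of nodes x tasks per node.
total_tasks=$((nodes * ntasks_per_node))

# Print the directives you would place in a job script, plus the total.
echo "#SBATCH --nodes=${nodes}"
echo "#SBATCH --ntasks-per-node=${ntasks_per_node}"
echo "Total cores requested: ${total_tasks}"
```

Use this form when the layout of tasks across nodes matters; use --ntasks alone when only the total core count matters.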
#!/bin/bash

# Which partition/queue does this job need to run in. Default is 'hsw-fdr'
#SBATCH --partition=hsw-fdr

# How long does my job have to run (HH:MM:SS);
# without this option the limit is 5 minutes
#SBATCH --time=01:00:00

# How many cores should I run my job on; for MPI jobs this should be the
# number you'd pass to mpirun (i.e. mpirun -np X). If not specified the
# default is 1
#SBATCH --ntasks=60

# This is the memory needed per task (see above). If not specified you will get
# 3GB of RAM per CPU.
#SBATCH --mem-per-cpu=1G

# The descriptive name for your job. This will potentially be visible to other
# users on ACTnowHPC
#SBATCH --job-name=mpi_test

# The name of the file to write stdout/stderr to. Use %j as a placeholder
# for the current job number
#SBATCH --output=mpi_test-%j.out

# Load the MPI version your code was compiled with
module load mvapich2-2.2a/gcc

# Issue the mpirun command with my binary; no need for the -np option as the
# scheduler takes care of that for you
mpirun /opt/examples/mpihello/mpihello-mvapich2-2.2a
Options used in this job submission script
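|--partition=hsw-fdr|Run in the hsw-fdr partition|
|--time=01:00:00|Run for 1 hour|
|--ntasks=60|Run the job on 60 cores|
|--mem-per-cpu=1G|I will need 1GB of RAM per CPU/core for my job (in this example 60GB total)|
|--job-name=mpi_test|I'm naming my job "mpi_test"|
|--output=mpi_test-%j.out|Write all the output to a file called mpi_test-%j.out, where %j is replaced by the job number|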
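The 60GB total follows from multiplying the per-core memory by the number of cores. A minimal sketch of that arithmetic, using the values from this script and the 3GB per-CPU default mentioned in its comments (the variable names are illustrative only):

```shell
#!/bin/bash
# Values from the example script: --ntasks=60 and --mem-per-cpu=1G.
# Variable names are hypothetical and used only for this calculation.
ntasks=60
mem_per_cpu_gb=1

# Total memory = cores x memory per core.
total_mem_gb=$((ntasks * mem_per_cpu_gb))

# For comparison: the default of 3GB per CPU if --mem-per-cpu is omitted.
default_mem_per_cpu_gb=3
default_total_gb=$((ntasks * default_mem_per_cpu_gb))

echo "Requested total: ${total_mem_gb}GB"
echo "Total with the default per-CPU memory: ${default_total_gb}GB"
```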
To run the job issue the following command
$ sbatch mpi-job.sh
Submitted batch job 1521
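If you want to reuse the job number in later commands, it can be pulled out of the submission message. The sketch below uses the message from this example as a fixed string so that it runs anywhere; on the cluster you would capture the real output of sbatch instead, as noted in the comment. The variable names are illustrative.

```shell
#!/bin/bash
# On the cluster you would capture the real message, for example:
#   submit_msg=$(sbatch mpi-job.sh)
# Here the message from this example is hard-coded for illustration.
submit_msg="Submitted batch job 1521"

# The job ID is the last space-separated word of the message.
job_id=${submit_msg##* }

# The script sets --output=mpi_test-%j.out, and %j expands to the job ID.
output_file="mpi_test-${job_id}.out"

echo "Job ID: ${job_id}"
echo "Output will be written to: ${output_file}"
```

The job ID can then be passed to squeue, and the output file inspected once the job has finished.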
Check the status of the job
Check the status with squeue (for more information, see Basic SLURM commands).
$ squeue --job 1521