slurm-seq.sh
sbatch --job-name=<myjobname> slurm-seq.sh ./sequential_code <arg1 arg2 argn>
<myjobname> : Name of your choice for the job; it is also used to name the output file (myjobname_out.jobID) and the error file (myjobname_err.jobID).
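For example, a hypothetical sequential executable ./pi_seq that takes a single argument (the job name, program name, and argument below are only placeholders) could be submitted as:
sbatch --job-name=pi_seq slurm-seq.sh ./pi_seq 1000000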
slurm-openmp.sh
export OMP_PROC_BIND=<true, false, close, spread>
sbatch --job-name=<myjobname> --export=ALL --cpus-per-task=<num_threads> slurm-openmp.sh ./openmp_code <arg1 arg2 argn>
<myjobname> : Name of your choice for the job; it is also used to name the output file (myjobname_out.jobID) and the error file (myjobname_err.jobID).
<num_threads> : Number of OpenMP threads, from 1 to 48. On the Heracles architecture each node has 24 physical cores with hyper-threading, for a total of 48 hardware threads.
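For example, a hypothetical OpenMP executable ./heat_omp could be run with 24 threads and spread binding as follows (the job name, program name, and arguments are placeholders):
export OMP_PROC_BIND=spread
sbatch --job-name=heat_omp --export=ALL --cpus-per-task=24 slurm-openmp.sh ./heat_omp 1024 500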
slurm-mpi.sh
Set up Intel compiler environment:
source /opt/intel/oneapi/setvars.sh
sbatch --job-name=<myjobname> --nodes=<num_nodes> --ntasks=<num_tasks> slurm-mpi.sh ./mpi_code <arg1 arg2 argn>
<myjobname> : Name of your choice for the job; it is also used to name the output file (myjobname_out.jobID) and the error file (myjobname_err.jobID).
<num_nodes> : Requests a total of N nodes for the job; MPI tasks are distributed across these nodes.
<num_tasks> : Total number of MPI tasks. These tasks are distributed across the requested nodes; to see how many tasks per node Slurm will launch, divide --ntasks by --nodes.
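For example, a hypothetical MPI executable ./wave_mpi could be run with 48 tasks spread over 2 nodes, i.e. 24 tasks per node (all names and arguments are placeholders):
source /opt/intel/oneapi/setvars.sh
sbatch --job-name=wave_mpi --nodes=2 --ntasks=48 slurm-mpi.sh ./wave_mpi 1024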
slurm-mpi-omp.sh
Set up Intel compiler environment:
source /opt/intel/oneapi/setvars.sh
export OMP_PROC_BIND=<true, false, close, spread>
sbatch --export=ALL --job-name=<myjobname> --cpus-per-task=<num_threads> --nodes=<num_nodes> --ntasks=<num_tasks> slurm-mpi-omp.sh ./mpiomp_code <arg1 arg2 argn>
<myjobname> : Name of your choice for the job; it is also used to name the output file (myjobname_out.jobID) and the error file (myjobname_err.jobID).
<num_threads> : Number of OpenMP threads, from 1 to 48. On the Heracles architecture each node has 24 physical cores with hyper-threading, for a total of 48 hardware threads.
<num_nodes> : Requests a total of N nodes for the job; MPI tasks are distributed across these nodes.
<num_tasks> : Total number of MPI tasks. These tasks are distributed across the requested nodes; to see how many tasks per node Slurm will launch, divide --ntasks by --nodes.
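As a combined example, a hypothetical hybrid MPI+OpenMP executable ./hybrid could be run with 4 MPI tasks on 2 nodes and 12 OpenMP threads per task (names and arguments are placeholders):
source /opt/intel/oneapi/setvars.sh
export OMP_PROC_BIND=close
sbatch --export=ALL --job-name=hybrid --cpus-per-task=12 --nodes=2 --ntasks=4 slurm-mpi-omp.sh ./hybrid 1024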
slurm-gpu.sh
sbatch --job-name=<myjobname> slurm-gpu.sh ./cuda_code <arg1 arg2 argn>
sbatch --job-name=<myjobname> slurm-gpu.sh nsys profile --trace=cuda,nvtx,osrt ./cuda_code <arg1 arg2 argn>
sbatch --job-name=<myjobname> slurm-gpu.sh nsys nvprof --print-gpu-trace ./cuda_code
<myjobname> : Name of your choice for the job; it is also used to name the output file (myjobname_out.jobID) and the error file (myjobname_err.jobID).
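For example, a hypothetical CUDA executable ./vecadd_gpu could be submitted directly, and then profiled with Nsight Systems, as follows (the job names, program name, and argument are placeholders):
sbatch --job-name=vecadd slurm-gpu.sh ./vecadd_gpu 1000000
sbatch --job-name=vecadd_prof slurm-gpu.sh nsys profile --trace=cuda,nvtx,osrt ./vecadd_gpu 1000000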
The job may not appear in the queue because it finishes very quickly. To check the queue, use:
squeue
To check the error file, use the following command (replace <jobID> with your actual job ID):
cat myjobname_err.<jobID>
To check the output file, use the following command (replace <jobID> with your actual job ID):
cat myjobname_out.<jobID>
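For example, if the hypothetical job vecadd above were assigned job ID 12345 (sbatch prints the ID when the job is submitted), the queue and the resulting files could be inspected with:
squeue -u $USER
cat vecadd_err.12345
cat vecadd_out.12345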