The simplest way to submit a job to Slurm is to create a script for your job, then submit that script with the sbatch program.
For example, let's say I have the following script, named myjob.sh, which simply prints out the text "Hello, world!":
#!/bin/bash
echo "Hello, world!"
I could then submit this to Slurm by running sbatch myjob.sh. Slurm will give us the ID of the job, for example:
Submitted batch job 8985982
Slurm will receive the script, determine which of the computing nodes is best suited to run it, send the script to that node and run it, and when the program finishes, write a file named slurm-<ID>.out containing whatever the program printed. In this case, I should see a file named slurm-8985982.out with the contents "Hello, world!".
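To view the job's output once it finishes, you can simply print the output file (using the job ID from the example above); it should contain "Hello, world!":
cat slurm-8985982.out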
In this case, we provide no configuration options to sbatch, so it submits the job with default options. However, sbatch has options for specifying things like the number of nodes to allocate to a job, how many CPU cores to use, and whom to email regarding the status of a job. These options are useful, so it's worth specifying them when submitting a job. sbatch allows us to specify these options inside the script itself by providing special comments at the beginning of the script. For example:
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --mail-type=END
#SBATCH [email protected]
###################################
echo print all system information
uname -a
echo print effective userid
whoami
echo "Today's date is:"
date
echo Your current directory is:
pwd
echo The files in your current directory are:
ls -lt
echo Have a nice day!
sleep 20
#SBATCH -N 1 - Specifies the number of nodes Slurm should allocate to the job. Always set -N 1 unless you receive special instructions from the system admin.
#SBATCH -n 1 - Specifies the number of cores Slurm should allocate to the job. Don't specify more than 16 unless approved by the system admin.
#SBATCH --mail-type=END - Specifies when you should receive an email notification regarding the job. Options are BEGIN, END, FAIL, and ALL.
#SBATCH [email protected] - Specifies the email address that should receive notifications about the status of this job.
To submit your job to a specific node, use the following command:
sbatch -w [cX] [script_file]
Where X is a number specifying the node you intend to use. For example, the following command will submit myjob.sh to node c5:
sbatch -w c5 myjob.sh
To submit your job while excluding certain nodes (for example, to exclude nodes c5 to c22), use the following command:
sbatch --exclude c[5-22] myjob.sh
The same flags are applicable when running srun.
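For example, the equivalent srun invocations would look something like the following (a sketch, assuming the interactive --pty bash usage described later in this section):
srun -w c5 --pty bash
srun --exclude c[5-22] --pty bash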
You can also add these flags to your script as SBATCH comments instead of passing them as command-line flags. For example, to submit a script with sbatch that you'd like to run on node c5, you'd write:
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --mail-type=END
#SBATCH [email protected]
#SBATCH -w c5
###################################
echo print all system information
uname -a
echo print effective userid
whoami
echo "Today's date is:"
date
echo Your current directory is:
pwd
echo The files in your current directory are:
ls -lt
echo Have a nice day!
sleep 20
To submit your job to a specific partition, use the following command:
sbatch -p [partition name] [script_file]
Where [partition name] is one of normal, bcc, or kellis. If no partition is provided, Slurm defaults to normal.
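For example, the following explicitly submits myjob.sh to the normal partition (the same as the default behavior):
sbatch -p normal myjob.sh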
You can also add this flag to your script as an SBATCH comment instead of passing it as a command-line flag. For example, to submit a script with sbatch that you'd like to run on the bcc partition, you'd write:
#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --mail-type=END
#SBATCH [email protected]
#SBATCH -p bcc
###################################
echo print all system information
uname -a
echo print effective userid
whoami
echo "Today's date is:"
date
echo Your current directory is:
pwd
echo The files in your current directory are:
ls -lt
echo Have a nice day!
sleep 20
The same flag is applicable when running srun.
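For example, to start an interactive session on the normal partition (a sketch; --pty bash is covered later in this section):
srun -p normal --pty bash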
To monitor the progress of your job, use the command squeue. To display information about only the jobs you submitted, use the following (where username is your username):
squeue -u username
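You can also check on a single job by passing its ID to squeue's -j flag (a sketch using the example job ID from above):
squeue -j 8985982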
To cancel a job that you are running, you can use the scancel command and pass it your job ID:
scancel <JOB ID>
scancel 1234567
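To cancel all of the jobs you have submitted at once, scancel also accepts a username (a sketch; substitute your own username):
scancel -u username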
You can check the general status of the computing nodes by using the sinfo command, which will display something like the following:
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
normal* up 14-00:00:0 1 drain* c8
normal* up 14-00:00:0 1 drain c39
normal* up 14-00:00:0 18 mix c[5-7,9-14,16-19,21-22,24-26]
normal* up 14-00:00:0 1 alloc c40
normal* up 14-00:00:0 14 idle c[15,20,27-38]
bcc up 14-00:00:0 5 mix b[13-16],c2
bcc up 14-00:00:0 4 idle b17,c[1,3-4]
kellis up 28-00:00:0 12 mix b[1-12]
We've also provided a custom command, nodeInf, that will give more detailed information about each node. For example:
NODELIST PARTITION CPUS(A/I/O/T) CPU_LOAD FREE_MEM MEMORY STATE
b1 kellis 82/14/0/96 145.58 405507 768000 mixed
b2 kellis 84/12/0/96 302.81 504957 768000 mixed
...
c1 bcc 0/32/0/32 0.01 127590 128000 idle
c2 bcc 8/24/0/32 15.34 380 128000 mixed
c3 bcc 0/32/0/32 0.01 127534 128000 idle
c4 bcc 0/32/0/32 0.01 127505 128000 idle
c5 normal* 2/14/0/16 0.01 127561 128000 mixed
c6 normal* 2/14/0/16 18.90 344 128000 mixed
...
NODELIST - The name of the node
PARTITION - The partition the node belongs to
CPUS (A/I/O/T) - The number of CPUs that are Allocated/Idle/Other, and the Total number of CPUs
CPU_LOAD - The load on the CPU
FREE_MEM - The amount of free RAM on the node (in MB)
MEMORY - The total amount of RAM on the node (in MB)
STATE - The state of the node. Idle means it's not in use, mixed means it's in use but still has resources available, allocated means it's fully in use, and drained means it has been taken out of service and is not accepting jobs.
While submitting a single script to Slurm can be useful, sometimes it's useful to quickly test programs on the command line. How can you do that while still taking advantage of the compute nodes' processing power?
Slurm has a built-in utility for doing just this: srun. To run an interactive session, run:
srun --pty bash
This will assign you to a compute node, then start a bash shell session on that node. You can now run programs interactively on the command line, just as if you were on the head node. This is often useful when you are compiling, debugging, or testing a program, and the program does not take long to finish.
Remember to exit cleanly from interactive sessions when you are done; otherwise they may be killed without notice.
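If you need more than the default resources for an interactive session, the same flags described above for sbatch can be passed to srun (a sketch, assuming you want 4 cores on a single node):
srun -N 1 -n 4 --pty bash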
To solve problems 1 and 2, we use a program called Slurm on our cluster. Slurm is a "job scheduler": essentially, it receives "jobs" and then sends them to computing nodes in a way that utilizes resources as efficiently as possible.
You never want to run any resource-intensive program on the head node. Always delegate resource-intensive jobs to Slurm so that it sends your job to a computing node. This benefits you and all other users: your job gets more processing power, and the head node's processing power stays free for Slurm to schedule everyone's jobs.
Our Luria cluster has the following nodes:
c1-4: 16 cores, Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
c5-40: 8 cores, Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
b1-16: 48 cores, Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz or 5220R CPU @ 2.20GHz
These nodes are organized into the following partitions:
kellis: b1-12
bcc: b13-17, c1-4
normal: c5-40
The kellis and bcc partitions are reserved for their respective labs. The normal partition is the default partition that can be used by any lab. You should never use the kellis or bcc partitions unless you have been given express permission to do so.
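To see only the nodes in a particular partition, sinfo accepts the same -p flag described above (a sketch):
sinfo -p normal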