Environment Modules

Environment modules are bundles that hold a program together with all the shell environment settings that the program needs to run. When a user loads a module, that user's shell environment is modified to include the module's settings. When the module is unloaded, those settings are cleanly removed and the user's original environment is left unchanged.
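To make the idea concrete, here is a minimal conceptual sketch of what loading and unloading amount to for a single variable, PATH. It is not how the module command is implemented, and real modules may adjust other variables as well. The directory /opt/example/r-3.4.3/bin is a hypothetical install location used only for illustration.

```shell
#!/bin/bash
# Conceptual sketch only: mimics the effect of loading and unloading a module on PATH.
# MODULE_BIN is a hypothetical install location, not a real path on Luria.
MODULE_BIN=/opt/example/r-3.4.3/bin

# Remember the environment as it was before the "load".
ORIGINAL_PATH=$PATH

# "Load": put the module's bin directory at the front of PATH.
PATH="$MODULE_BIN:$PATH"
echo "first PATH entry after load: ${PATH%%:*}"

# "Unload": strip that leading directory again.
PATH="${PATH#"$MODULE_BIN":}"
if [ "$PATH" = "$ORIGINAL_PATH" ]; then
    echo "PATH restored after unload"
else
    echo "PATH differs from the original"
fi
```

In practice you would never edit PATH by hand like this; the module command does the bookkeeping for you.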

You can see what modules are available on Luria by running:

module avail

You'll notice we have modules for common scientific programs, such as bwa or R. You'll also notice that there are multiple versions of each of these programs.

Since the modules are completely separate from one another, there can be modules for different versions of the same software without any conflict between the two versions. This way, the needs of multiple users can be satisfied.

To load a module, you'd run:

module load <module name>/<module version>

So to load R v3.4.3:

module load r/3.4.3

Now that the module is loaded, any programs in the module should be available on the command line. So after loading the R module, you can run R.

Once you no longer need to use the module, you unload it by running:

module del <module name>

So, to unload the R module:

module del r

R will no longer be available on the command line.
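One way to confirm that loading or unloading had the expected effect is to check whether the program can be found on your PATH. The sketch below defines a small helper, is_available, which is a hypothetical name introduced here and not part of the module system. Running module list is another way to see which modules are currently loaded.

```shell
#!/bin/bash
# Hypothetical helper: report whether a command can currently be found on PATH.
is_available() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1: available"
    else
        echo "$1: not found"
    fi
}

# Check before and after loading or unloading a module.
# The result depends on which modules are loaded in the current shell.
is_available R
```

For example, calling is_available R before and after module load r/3.4.3 would be expected to show the change, assuming R is not otherwise installed on the node.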

Modules in Slurm

If you want to submit a script to Slurm with sbatch, for example an R script, you need to write a batch script that first loads the appropriate module and then runs your program or script.

For example, write a script named myRjob.sh:

#!/bin/bash
#SBATCH -N 1
#SBATCH -n 1
#SBATCH --mail-type=END
#SBATCH --mail-user=example@mit.edu
###################################

module load r/3.4.3

Rscript /path/to/your/script.R

This script makes sure that the R module is loaded before running the R script. Now, when you submit the job to Slurm with sbatch myRjob.sh, it will run correctly.

Example script 1:

#!/bin/bash
#SBATCH -N 1 # Number of nodes. You must always set -N 1 unless you receive special instruction from the system admin
#SBATCH -n 8 # Number of tasks. Don't specify more than 16 unless approved by the system admin

module load fastqc/0.11.5
module load bwa/0.7.17
mkdir -p ~/data/class
cd ~/data/class
fastqc -o ~/data/class /net/rowley/ifs/data/dropbox/test_1.fastq
bwa mem -t8 -f ex1.sam /home/Genomes/bwa_indexes/mm10.fa /net/rowley/ifs/data/dropbox/UNIX/test_1.fastq

Example script 2:

#!/bin/bash
#SBATCH -N 1                      # Number of nodes. You must always set -N 1 unless you receive special instruction from the system admin
#SBATCH -n 16                     # Number of tasks. Don't specify more than 16 unless approved by the system admin

module load fastqc/0.11.5
module load bwa/0.7.17
FILE=$1                           # Path to the input FASTQ file, passed as the first argument
WORKDIR=~/data/class
mkdir -p "$WORKDIR"
cd "$WORKDIR"
fastqc -o "$WORKDIR" "$FILE"
bwa mem -t16 -f "$(basename "$FILE").sam" /home/Genomes/bwa_indexes/mm10.fa "$FILE"
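Example script 2 reads the FASTQ path from its first argument, so the path goes after the script name when you submit the job with sbatch. The short sketch below shows how the output file name is derived from that path. It reuses the test file path from Example script 1 purely as sample input, and the second variant, which strips the .fastq extension, is an optional alternative rather than what the script above does.

```shell
#!/bin/bash
# Same naming logic as Example script 2, applied to a sample input path.
FILE=/net/rowley/ifs/data/dropbox/test_1.fastq

# basename drops the directory part, so the SAM file lands in the working directory.
OUT="$(basename "$FILE").sam"
echo "$OUT"          # prints test_1.fastq.sam

# Optional variation: give basename a suffix to strip the .fastq extension as well.
OUT_SHORT="$(basename "$FILE" .fastq).sam"
echo "$OUT_SHORT"    # prints test_1.sam
```

Because only the file name is kept, two inputs with the same name in different directories would write to the same SAM file in the working directory.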

Example script 3:

#!/bin/bash

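#SBATCH -N 1                      # Number of nodes
#SBATCH -n 4                      # Number of tasks for each array job
#SBATCH --array=1-2               # Run two array jobs, with SLURM_ARRAY_TASK_ID set to 1 and 2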

module load fastqc/0.11.5
module load bwa/0.7.17
FASTQDIR=/net/rowley/ifs/data/dropbox
WORKDIR=~/data/class
mkdir -p "$WORKDIR"
cd "$WORKDIR"
# Pick the Nth .fastq file in FASTQDIR, where N is this job's array index
FILE=$(ls "$FASTQDIR"/*.fastq | sed -n "${SLURM_ARRAY_TASK_ID}p")
fastqc -o "$WORKDIR" "$FILE"
bwa mem -t4 -f "$(basename "$FILE").sam" /home/Genomes/bwa_indexes/mm10.fa "$FILE"
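The file selection in Example script 3 relies on each array job receiving a different SLURM_ARRAY_TASK_ID, which is then used as a line number into the sorted file listing. The sketch below reproduces that selection outside Slurm. The temporary directory and the sample_A/sample_B file names are stand-ins invented for the demonstration, and SLURM_ARRAY_TASK_ID is set by hand in place of Slurm.

```shell
#!/bin/bash
# Stand-in for the FASTQ directory: a temporary directory with two empty files.
FASTQDIR=$(mktemp -d)
touch "$FASTQDIR/sample_A.fastq" "$FASTQDIR/sample_B.fastq"

# Slurm sets SLURM_ARRAY_TASK_ID for each array job; here we loop over the values by hand.
for SLURM_ARRAY_TASK_ID in 1 2; do
    # sed -n "Np" prints only line N of the listing, i.e. the Nth file.
    FILE=$(ls "$FASTQDIR"/*.fastq | sed -n "${SLURM_ARRAY_TASK_ID}p")
    echo "task $SLURM_ARRAY_TASK_ID -> $(basename "$FILE")"
done

# Clean up the temporary directory.
rm -r "$FASTQDIR"
```

The range given to --array should match the number of .fastq files in the directory. An index beyond the last file would leave FILE empty.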
