Running Images in Singularity

Singularity Commands

Singularity is packaged on Luria as an environment module, so you'll need to load the module before invoking any Singularity commands. We'll also run these commands in an interactive session on a compute node so we don't use up the head node's resources.

Now, we can either have Singularity manage the image itself, or create the SIF file in our current directory. We'll do both in this exercise.

Let's run the same basic 'Hello, World!' command we did in Docker, again using the Debian Docker image:

srun --pty bash

module load singularity/3.10.4

singularity exec docker://debian echo 'Hello, World!'
INFO:    Converting OCI blobs to SIF format
INFO:    Starting build...
Getting image source signatures
Copying blob 1468e7ff95fc done
Copying config d5269ef9ec done
Writing manifest to image destination
Storing signatures
2024/05/01 11:02:47  info unpack layer: sha256:1468e7ff95fcb865fbc4dee7094f8b99c4dcddd6eb2180cf044c7396baf6fc2f
INFO:    Creating SIF file...
Hello, World!

Where Docker uses run, Singularity uses exec to execute an arbitrary command inside of a container. The image we provide is a Docker image, so we tell Singularity that by prefixing the image name with docker://. Singularity looks for this image on Dockerhub, downloads it, converts it to the SIF file format, places that SIF file in its cache under ~/.singularity, and then executes the given command in the container. Subsequent calls that use this image reuse the cached copy instead of downloading the image again.
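
If you're curious what Singularity has stored so far, you can look at its cache. The snippet below is a minimal sketch that assumes the default cache location, ~/.singularity/cache, and that the SINGULARITY_CACHEDIR environment variable is not set. The variable name cache_dir is only illustrative.

```shell
#!/bin/bash
# Assumed default cache location (only valid when SINGULARITY_CACHEDIR is unset).
cache_dir="$HOME/.singularity/cache"
echo "Singularity cache: ${cache_dir}"

# List the cached images if the singularity command is available,
# which on Luria means after `module load singularity/3.10.4`.
if command -v singularity >/dev/null 2>&1; then
    singularity cache list
else
    echo "singularity not found; load the module first" >&2
fi
```

Converted images can be large, so it's worth checking the cache from time to time.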

To run an interactive session in Singularity, you could do something similar to what we did in Docker, where we simply execute bash in the container.

singularity exec docker://debian bash
Singularity>

However, Singularity has a built-in command to do this that makes the syntax much nicer.

singularity shell docker://debian
Singularity>

Singularity automatically bind-mounts your home directory into the container, so you'll have access to your files like normal. However, your ~/data folder is a symbolic link to /net/<storage server>, which does not exist inside the Singularity container, so this symbolic link will be broken.

Like Docker, Singularity allows you to mount directories from your computer to inside the container. To get around the symbolic link issue when running the image from your home directory, you could simply mount the /net directory on Luria to the /net directory in your container. Since the /net directory contains your lab's storage server and you're keeping the same name on the container, the symbolic link should work as normal.

singularity shell --bind /net:/net docker://debian
Singularity> ls data
# You should see the files from your storage server
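
To decide which directory needs to be mounted, it helps to see where a symbolic link really points. The following is a small sketch on a throwaway directory, assuming GNU readlink is available. The storage/lab and data names only stand in for /net/<storage server> and ~/data.

```shell
#!/bin/bash
# Build a throwaway layout that mimics a home directory with a linked data folder.
demo=$(mktemp -d)
mkdir -p "${demo}/storage/lab"                # stands in for /net/<storage server>
ln -s "${demo}/storage/lab" "${demo}/data"    # stands in for ~/data

# readlink -f prints the fully resolved path behind the link.
target=$(readlink -f "${demo}/data")
echo "data resolves to: ${target}"

# On Luria you would run `readlink -f ~/data` instead; the directory it
# reports (under /net) is the one that must be visible in the container.
```

Whatever top-level directory the resolved path lives under is the one to pass to --bind.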

SIF Files

So far, we've been letting Singularity manage images itself. However, we can also instruct Singularity to download the Docker image, create a SIF file from it, and let us handle this SIF file. We do this by using the pull command:

singularity pull docker://debian

ls

# You should see a file named debian_latest.sif

Now, instead of instructing Singularity to use the docker://debian image, we can simply point it to the debian_latest.sif file. This means a lab can keep a directory of common Singularity images that lab members run directly, instead of every lab member pulling their own copy. Running from a local SIF file is also faster, since Singularity uses the file as-is rather than resolving the docker:// reference each time.

singularity shell debian_latest.sif
Singularity>

singularity exec debian_latest.sif echo 'Hello, World!'
Hello, World!
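
A shared image directory could be managed with a small helper like the one below. This is only a sketch: ensure_image is a hypothetical function, and /path/to/lab/images is a placeholder for wherever your lab keeps its images. It relies on singularity pull accepting an output file name before the image URI.

```shell
#!/bin/bash
# Pull an image into a shared directory only if the SIF file is not already there.
# ensure_image is a hypothetical helper, not part of Singularity.
ensure_image() {
    local dir="$1" name="$2" uri="$3"
    local sif="${dir}/${name}"
    if [ ! -f "${sif}" ]; then
        mkdir -p "${dir}" || return 1
        # Send pull output to stderr so only the SIF path is printed on stdout.
        singularity pull "${sif}" "${uri}" >&2 || return 1
    fi
    echo "${sif}"
}

# Example (on a compute node, after `module load singularity/3.10.4`):
#   sif=$(ensure_image /path/to/lab/images debian_latest.sif docker://debian)
#   singularity exec "${sif}" echo 'Hello, World!'
```

Because the function skips the pull when the file already exists, only the first lab member to ask for an image pays the download cost.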

Running RStudio with Seurat Tools

Let's run the image we created earlier. There's a pre-built version of this image available on Dockerhub at asoberan/abrfseurat.

Singularity has trouble interpreting symbolic links, and since the ~/data/ directory in a user's home directory is a symbolic link, we'll see issues when trying to run Singularity. To remedy this, we'll run the following command once we're in the ~/data/ folder:

cd "$(pwd -P)"

This will change our directory to the full physical path of our current working directory. So we'll be in the same directory, but without following the symbolic link in our home directories.
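
To see what pwd -P does, here is a short sketch on a throwaway directory. The real and link names are made up for the demonstration.

```shell
#!/bin/bash
# Create a directory and a symbolic link that points to it.
demo=$(mktemp -d)
mkdir "${demo}/real"
ln -s "${demo}/real" "${demo}/link"

cd "${demo}/link"
logical=$(pwd -L)     # path as entered, still going through "link"
physical=$(pwd -P)    # path with the symbolic link resolved, ending in "real"
echo "logical:  ${logical}"
echo "physical: ${physical}"

# Same directory as before, but now addressed by its physical path.
cd "$(pwd -P)"
echo "after cd: $(pwd)"
```

After the final cd, the shell reports the physical path, which is the form Singularity can work with.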

Now, we can begin to run Singularity images. The Singularity program is packaged as an environment module on Luria, so you'll have to load it in first. To start an interactive session in a Singularity container, we use singularity shell <image>. In this case, the image we'll be using is the asoberan/abrfseurat:latest-x86_64 image on Dockerhub, so we'll run the following:

module load singularity/3.10.4

singularity shell docker://asoberan/abrfseurat:latest-x86_64

This will pull the Docker image, convert it to a SIF file, store it in ~/.singularity (which is a symbolic link to <your lab storage server user directory>/singularity), and then start an interactive session in that image. Once it's done, you'll have a shell session inside the container and will be able to use the tools in the image. For example, you'll be able to use R:

[asoberan@luria test]$ singularity shell docker://asoberan/abrfseurat:latest-x86_64
Singularity> R

R version 4.2.2 (2022-10-31) -- "Innocent and Trusting"
Copyright (C) 2022 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)

R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under certain conditions.
Type 'license()' or 'licence()' for distribution details.

  Natural language support but running in an English locale

R is a collaborative project with many contributors.
Type 'contributors()' for more information and
'citation()' on how to cite R or R packages in publications.

Type 'demo()' for some demos, 'help()' for on-line help, or
'help.start()' for an HTML browser interface to help.
Type 'q()' to quit R.

> 

However, we will not be able to run RStudio on the image as it stands. This is because RStudio needs to create particular settings and database files at locations in the filesystem which are read-only in the Singularity image. To fix this, we'll need to create these directories ourselves. Below is a script that does just that, while also running the RStudio server from the Singularity image:

#!/bin/bash

#SBATCH --job-name=Rstudio       # Assign a short name to your job
#SBATCH --output=slurm.%N.%j.out     # STDOUT output file

module load singularity/3.10.4

workdir=$(python -c 'import tempfile; print(tempfile.mkdtemp())')

mkdir -p -m 700 ${workdir}/run ${workdir}/tmp ${workdir}/var/lib/rstudio-server
cat > ${workdir}/database.conf <<END
provider=sqlite
directory=/var/lib/rstudio-server
END

cat > ${workdir}/rsession.sh <<END
#!/bin/sh
export OMP_NUM_THREADS=${SLURM_JOB_CPUS_PER_NODE}
exec /usr/lib/rstudio-server/bin/rsession "\${@}"
END

chmod +x ${workdir}/rsession.sh

export SINGULARITY_BIND="${workdir}/run:/run,${workdir}/tmp:/tmp,${workdir}/database.conf:/etc/rstudio/database.conf,${workdir}/rsession.sh:/etc/rstudio/rsession.sh,${workdir}/var/lib/rstudio-server:/var/lib/rstudio-server"
export SINGULARITYENV_RSTUDIO_SESSION_TIMEOUT=0
export SINGULARITYENV_USER=$(id -un)
export SINGULARITYENV_PASSWORD=$(echo $RANDOM | base64 | head -c 20)

readonly PORT=$(python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')

cat 1>&2 <<END

1. SSH tunnel from your workstation using the following command:

   ssh -t -L 8787:localhost:${PORT} ${SINGULARITYENV_USER}@luria.mit.edu ssh -t ${HOSTNAME} -L ${PORT}:localhost:${PORT}

   and point your web browser to http://localhost:8787

2. log in to RStudio Server using the following credentials:

   user: ${SINGULARITYENV_USER}
   password: ${SINGULARITYENV_PASSWORD}

When done using RStudio Server, terminate the job by:

1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
2. Issue the following command on the login node:

      scancel -f ${SLURM_JOB_ID}
END

singularity exec --cleanenv -H ~/data:/home/rstudio docker://asoberan/abrfseurat:latest-x86_64 /usr/lib/rstudio-server/bin/rserver \
            --server-user ${USER} --www-port ${PORT} \
            --auth-none=0 \
            --auth-pam-helper-path=pam-helper \
            --auth-stay-signed-in-days=30 \
            --auth-timeout-minutes=0 \
            --rsession-path=/etc/rstudio/rsession.sh 
printf 'rserver exited' 1>&2

Let's go through the script step-by-step to understand what it's doing.

workdir=$(python -c 'import tempfile; print(tempfile.mkdtemp())')

mkdir -p -m 700 ${workdir}/run ${workdir}/tmp ${workdir}/var/lib/rstudio-server

cat > ${workdir}/database.conf <<END
provider=sqlite
directory=/var/lib/rstudio-server
END

This part of the script uses Python to create a temporary directory, then populates it with the directories that will be bind-mounted into the Singularity container wherever RStudio needs a writable file system.

The last part of the snippet writes a file in the temporary directory, database.conf, with the contents you see. RStudio uses these settings to configure its database.
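
If Python is not available on the node, the same setup could be done with mktemp. This is a sketch of an alternative, not what the script above does, and it assumes GNU coreutils.

```shell
#!/bin/bash
# Create the scratch area with mktemp instead of Python's tempfile module.
workdir=$(mktemp -d)

# With -p, the -m 700 mode applies to the final directories only;
# intermediate parents such as var/ and var/lib/ get the default mode.
mkdir -p -m 700 "${workdir}/run" "${workdir}/tmp" "${workdir}/var/lib/rstudio-server"

# Write the RStudio database settings, same contents as in the script.
cat > "${workdir}/database.conf" <<END
provider=sqlite
directory=/var/lib/rstudio-server
END

ls -l "${workdir}"
```

Either way, the temporary directory is not removed automatically, so you may want to delete it once the job has finished.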

cat > ${workdir}/rsession.sh <<END
#!/bin/sh
export OMP_NUM_THREADS=${SLURM_JOB_CPUS_PER_NODE}
exec /usr/lib/rstudio-server/bin/rsession "\${@}"
END

chmod +x ${workdir}/rsession.sh

Here, the script makes another script in the temporary directory, rsession.sh, with the contents you see. The script sets OMP_NUM_THREADS to prevent OpenBLAS (and any other OpenMP-enhanced libraries used by R) from spawning more threads than the number of processors allocated to the job. Then it makes this script executable.
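
One detail worth noticing is which variables are expanded when the file is written and which are left for later. The sketch below writes the same wrapper to a temporary file, using a made-up value of 4 for SLURM_JOB_CPUS_PER_NODE, and prints the result.

```shell
#!/bin/bash
# Example value only; Slurm sets this variable inside a real job.
SLURM_JOB_CPUS_PER_NODE=4
wrapper=$(mktemp)

# ${SLURM_JOB_CPUS_PER_NODE} is expanded now, while the file is written.
# \${@} is escaped, so it is written literally and expanded when rsession.sh runs.
cat > "${wrapper}" <<END
#!/bin/sh
export OMP_NUM_THREADS=${SLURM_JOB_CPUS_PER_NODE}
exec /usr/lib/rstudio-server/bin/rsession "\${@}"
END

cat "${wrapper}"
```

The printed file contains OMP_NUM_THREADS=4 as a fixed value, while "${@}" is still there verbatim so the wrapper passes its arguments on to rsession.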

export SINGULARITY_BIND="${workdir}/run:/run,${workdir}/tmp:/tmp,${workdir}/database.conf:/etc/rstudio/database.conf,${workdir}/rsession.sh:/etc/rstudio/rsession.sh,${workdir}/var/lib/rstudio-server:/var/lib/rstudio-server"
export SINGULARITYENV_RSTUDIO_SESSION_TIMEOUT=0
export SINGULARITYENV_USER=$(id -un)
export SINGULARITYENV_PASSWORD=$(echo $RANDOM | base64 | head -c 20)

readonly PORT=$(python -c 'import socket; s=socket.socket(); s.bind(("", 0)); print(s.getsockname()[1]); s.close()')

This portion sets several environment variables and picks a port. The environment variables which begin with SINGULARITY_ are read by the Singularity program itself when we invoke it, while the environment variables which begin with SINGULARITYENV_ are passed into the container with the prefix removed (so SINGULARITYENV_USER becomes USER inside the container).

SINGULARITY_BIND is outlining the bind-mounts that should be created when we run the Singularity image. The bind-mounts are the temporary directories we made.
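
The long SINGULARITY_BIND line can be hard to read. One way to make it easier to maintain is to list the mounts in a Bash array and join them with commas. This is a sketch of an equivalent formulation, with a placeholder workdir.

```shell
#!/bin/bash
# Placeholder; in the script this comes from Python's tempfile.mkdtemp().
workdir=/tmp/example-workdir

# One entry per mount, each in the form host_path:container_path.
binds=(
    "${workdir}/run:/run"
    "${workdir}/tmp:/tmp"
    "${workdir}/database.conf:/etc/rstudio/database.conf"
    "${workdir}/rsession.sh:/etc/rstudio/rsession.sh"
    "${workdir}/var/lib/rstudio-server:/var/lib/rstudio-server"
)

# Join the array elements with commas inside a subshell so IFS is not changed globally.
SINGULARITY_BIND=$(IFS=,; echo "${binds[*]}")
export SINGULARITY_BIND
echo "${SINGULARITY_BIND}"
```

The resulting value is the same comma-separated string as in the script, just assembled from one mount per line.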

SINGULARITYENV_RSTUDIO_SESSION_TIMEOUT is setting the session timeout for RStudio. In this case, it's set not to suspend idle sessions.

SINGULARITYENV_USER is storing the user which will be used in RStudio. In this case it's ourselves.

SINGULARITYENV_PASSWORD is storing the password which will be used later in RStudio. The password is generated from Bash's built-in $RANDOM variable and encoded with base64.

PORT is finding an unused port number and storing it for later usage.
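
A note on the password: $RANDOM only produces integers from 0 to 32767, so the script can generate at most 32768 different passwords. If you want something harder to guess, one option is to read from /dev/urandom instead. This is a sketch of an alternative, not part of the original script.

```shell
#!/bin/bash
# Read 15 random bytes from the kernel and base64-encode them.
# 15 bytes encode to exactly 20 base64 characters, with no padding.
password=$(head -c 15 /dev/urandom | base64)
export SINGULARITYENV_PASSWORD="${password}"
echo "password length: ${#password}"
```

The exported variable is used in exactly the same way as before, so nothing else in the script has to change.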

cat 1>&2 <<END

1. SSH tunnel from your workstation using the following command:

   ssh -t -L 8787:localhost:${PORT} ${SINGULARITYENV_USER}@luria.mit.edu ssh -t ${HOSTNAME} -L ${PORT}:localhost:${PORT}

   and point your web browser to http://localhost:8787

2. log in to RStudio Server using the following credentials:

   user: ${SINGULARITYENV_USER}
   password: ${SINGULARITYENV_PASSWORD}

When done using RStudio Server, terminate the job by:

1. Exit the RStudio Session ("power" button in the top right corner of the RStudio window)
2. Issue the following command on the login node:

      scancel -f ${SLURM_JOB_ID}
END

This part of the script prints out information to the user so they can remember how to port-forward and what the login information for RStudio is.
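
The tunnel command makes two hops, which can be easier to follow with concrete values filled in. The sketch below only prints the command. The tunnel_command helper is hypothetical, and the user alice, node c1 and port 45678 are made-up examples.

```shell
#!/bin/bash
# Print the two-hop tunnel command for a given user, compute node and port.
tunnel_command() {
    local user="$1" node="$2" port="$3"
    # Hop 1: port 8787 on the workstation -> the given port on the head node.
    # Hop 2: that port on the head node -> the same port on the compute node.
    printf 'ssh -t -L 8787:localhost:%s %s@luria.mit.edu ssh -t %s -L %s:localhost:%s\n' \
        "${port}" "${user}" "${node}" "${port}" "${port}"
}

tunnel_command alice c1 45678
```

In the real script, the node name and port come from ${HOSTNAME} and ${PORT}, so you can copy the command straight from the Slurm output file.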

singularity exec --cleanenv -H ~/data:/home/rstudio docker://asoberan/abrfseurat:latest-x86_64 /usr/lib/rstudio-server/bin/rserver \
            --server-user ${USER} --www-port ${PORT} \
            --auth-none=0 \
            --auth-pam-helper-path=pam-helper \
            --auth-stay-signed-in-days=30 \
            --auth-timeout-minutes=0 \
            --rsession-path=/etc/rstudio/rsession.sh 
printf 'rserver exited' 1>&2

This final piece is where Singularity actually runs the RStudio server program in asoberan/abrfseurat using all of the configuration created earlier in the script.

Save this script somewhere on the cluster and submit it to a compute node using Slurm. Then read the Slurm output file, which contains the instructions for port forwarding from your workstation so you can access RStudio at http://localhost:8787.

sbatch seurat_script.sh

cat slurm.<node>.<id>.out

# Follow instructions
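
Since the output file name includes both the node name and the job ID, it can be handy to look it up by job ID alone. Below is a minimal sketch, where find_slurm_output is a hypothetical helper and the file pattern follows the --output=slurm.%N.%j.out line in the script.

```shell
#!/bin/bash
# Find the output file for a job ID in a directory, whatever the node name is.
# find_slurm_output is a hypothetical helper, not a Slurm command.
find_slurm_output() {
    local dir="$1" jobid="$2" f
    for f in "${dir}"/slurm.*."${jobid}".out; do
        # If nothing matched, f is the unexpanded pattern and this test fails.
        [ -e "${f}" ] && { echo "${f}"; return 0; }
    done
    return 1
}

# Example on Luria (sbatch --parsable prints the job ID instead of the usual message):
#   jobid=$(sbatch --parsable seurat_script.sh)
#   until out=$(find_slurm_output . "${jobid}"); do sleep 5; done
#   cat "${out}"
```

The loop in the example simply waits until the job has started and written its instructions.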
