Building Docker Images
Docker is a container engine, but it's also an image build tool. You can build Docker images yourself by creating a Dockerfile, essentially a file that outlines each step in creating your image.
Below are the common commands used in a Dockerfile to outline these steps:
FROM - Specifies the base image you're building on top of.
LABEL - A simple label attached to your image as metadata. A common label is description, which holds a short description of the image.
RUN - Runs the command you specify in the image. For example, if the base image is Ubuntu, you can run any Ubuntu command here. A common one is apt-get install <package>, which installs an Ubuntu package into your image.
CMD - The command that runs when the container is started. This tends to be the main software being packaged.
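To make the four instructions concrete, here is a minimal sketch that writes a tiny Dockerfile into a scratch directory and prints it. The base image (ubuntu:22.04), the installed package (curl), and the directory variable demo_dir are illustrative assumptions only; they are not part of the Seurat image built in this section.

```shell
# Create a scratch directory so the example does not touch existing files.
demo_dir="$(mktemp -d)"

# Write a hypothetical four-line Dockerfile, one line per instruction:
#   FROM  - start from an Ubuntu base image
#   LABEL - attach a description as metadata
#   RUN   - install a package at build time
#   CMD   - the command executed when a container starts
cat > "$demo_dir/Dockerfile" <<'EOF'
FROM ubuntu:22.04
LABEL description="Minimal example image"
RUN apt-get update && apt-get install -y curl
CMD ["curl", "--version"]
EOF

# Show the result.
cat "$demo_dir/Dockerfile"
```

Building this file with docker build would produce an image whose containers print the curl version and exit, which is enough to see each instruction play its role.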
Knowing these is enough to build a simple Docker image. We'll be using this knowledge to build our own Docker image for Seurat.
Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data.
We'll use rocker/rstudio as a base so that we can have RStudio available to us automatically.
Create a file named "Dockerfile".
First, we must select the base image. We'll use rocker/rstudio version 4.3.2, which comes with R 4.3.2. We'll make sure to label the image with a simple description.
FROM rocker/rstudio:4.3.2
LABEL description="Docker image for Seurat4"
Then, we must outline the steps needed to install Seurat4. rocker/rstudio is built on top of Ubuntu, so any packages we need to install should use Ubuntu's apt-get utility. The following packages are needed for the installation of Seurat and other tools:
RUN apt-get update && apt-get install -y \
libhdf5-dev build-essential libxml2-dev \
libssl-dev libv8-dev libsodium-dev libglpk40 \
libgdal-dev libboost-dev libomp-dev \
libbamtools-dev libboost-iostreams-dev \
libboost-log-dev libboost-system-dev \
libboost-test-dev libcurl4-openssl-dev libz-dev \
libarmadillo-dev libhdf5-cpp-103
Now, we can run R to install Seurat and other useful R tools, including BiocManager, which we'll use in the next step to install useful bioinformatics R libraries.
RUN R -e "install.packages(c('Seurat', 'hdf5r', 'dplyr', 'tidyverse', 'cowplot', 'knitr', 'slingshot', 'msigdbr', 'remotes', 'metap', 'devtools', 'R.utils', 'ggalt', 'ggpubr', 'BiocManager'), repos='http://cran.rstudio.com/')"
Installing R libraries using BiocManager:
RUN R -e "BiocManager::install(c('SingleR', 'slingshot', 'scRNAseq', 'celldex', 'fgsea', 'multtest', 'scuttle', 'BiocGenerics', 'DelayedArray', 'DelayedMatrixStats', 'limma', 'S4Vectors', 'SingleCellExperiment', 'SummarizedExperiment', 'batchelor', 'org.Mm.eg.db', 'AnnotationHub', 'scater', 'edgeR', 'apeglm', 'DESeq2', 'pcaMethods', 'clusterProfiler'))"
Installing other tools from GitHub:
RUN R -e "remotes::install_github(c('satijalab/seurat-wrappers', 'kevinblighe/PCAtools', 'chris-mcginnis-ucsf/DoubletFinder', 'velocyto-team/velocyto.R'))"
All together, the Dockerfile should look like this:
FROM rocker/rstudio:4.3.2
LABEL description="Docker image for Seurat4"
RUN apt-get update && apt-get install -y \
libhdf5-dev build-essential libxml2-dev \
libssl-dev libv8-dev libsodium-dev libglpk40 \
libgdal-dev libboost-dev libomp-dev \
libbamtools-dev libboost-iostreams-dev \
libboost-log-dev libboost-system-dev \
libboost-test-dev libcurl4-openssl-dev libz-dev \
libarmadillo-dev libhdf5-cpp-103
RUN R -e "install.packages(c('Seurat', 'hdf5r', 'dplyr', 'tidyverse', 'cowplot', 'knitr', 'slingshot', 'msigdbr', 'remotes', 'metap', 'devtools', 'R.utils', 'ggalt', 'ggpubr', 'BiocManager'), repos='http://cran.rstudio.com/')"
RUN R -e "BiocManager::install(c('SingleR', 'slingshot', 'scRNAseq', 'celldex', 'fgsea', 'multtest', 'scuttle', 'BiocGenerics', 'DelayedArray', 'DelayedMatrixStats', 'limma', 'S4Vectors', 'SingleCellExperiment', 'SummarizedExperiment', 'batchelor', 'org.Mm.eg.db', 'AnnotationHub', 'scater', 'edgeR', 'apeglm', 'DESeq2', 'pcaMethods', 'clusterProfiler'))"
RUN R -e "remotes::install_github(c('satijalab/seurat-wrappers', 'kevinblighe/PCAtools', 'chris-mcginnis-ucsf/DoubletFinder', 'velocyto-team/velocyto.R'))"
Now that we have the Dockerfile, we can run the docker build command from the command line. We'll want to tag our Docker image with our user name and the name of the image, preferably something descriptive. I'll choose asoberan/abrfseurat for my build.
cd /path/to/directory/where/Dockerfile/is/located
docker build -t asoberan/abrfseurat .
Of course, each of you could build this yourselves and have a custom local copy of this image. However, a key benefit of containerization is that it makes programs and environments portable. I've already created the image and uploaded it to Docker Hub, so instead of everyone creating their own image, you can just pull my existing image and use it immediately.
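Whether the image was built locally or pulled, it can be worth confirming that the R packages actually load, because a failed install.packages() call only emits a warning and does not stop docker build. The sketch below is one way to do that; the helper r_check_expr is a hypothetical name introduced here, the three packages are just a sample from the Dockerfile, and it assumes the image lets you override its default command with Rscript.

```shell
# Build a single R expression that errors if any named package is missing.
# (r_check_expr is a hypothetical helper, not part of Docker or R.)
r_check_expr() {
  expr=""
  for pkg in "$@"; do
    expr="${expr}stopifnot(requireNamespace('${pkg}', quietly = TRUE)); "
  done
  printf '%s' "$expr"
}

image="asoberan/abrfseurat"   # the image name used in the build command
check="$(r_check_expr Seurat SingleR DoubletFinder)"
echo "$check"

# Only attempt the check when Docker is installed and the image exists locally.
if command -v docker >/dev/null 2>&1 && docker image inspect "$image" >/dev/null 2>&1; then
  # Rscript exits with a non-zero status if any stopifnot() call fails.
  docker run --rm "$image" Rscript -e "$check"
fi
```

If a package is missing, the docker run command exits with an error that names the failing stopifnot() call, which points at the install step that needs attention.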
To upload your locally created images to Docker Hub, you'll first need to create an account on Docker Hub. Then, you'd run:
# Use your credentials to login
docker login
# Now that you are logged in, give the local image
# a version tag and push it to your account
docker tag asoberan/abrfseurat asoberan/abrfseurat:<tag>
docker push asoberan/abrfseurat:<tag>
Here, you'd replace asoberan/abrfseurat with whatever you've named your local image, and replace <tag> with a version tag of your choosing, such as 0.0.1.
Your image will be built for the CPU architecture of the machine on which you ran docker build. This means that if you ran docker build on a Mac with Apple Silicon, the image will be made for the aarch64 CPU architecture, also known as arm64. Luria's CPU architecture is x86_64, also known as amd64, so you won't be able to run aarch64 images on it. Please keep this in mind when building Docker images.
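As a rough illustration of the architecture point above, the following sketch maps the machine name reported by uname -m to the platform string Docker uses. The function docker_platform is a hypothetical helper, and the commented-out build line assumes a Docker installation that can emulate other architectures (Docker Desktop generally can); emulated builds tend to be much slower than native ones.

```shell
# Map a machine name (as printed by `uname -m`) to a Docker platform string.
# (docker_platform is a hypothetical helper introduced for this example.)
docker_platform() {
  case "$1" in
    x86_64|amd64)  echo "linux/amd64" ;;
    aarch64|arm64) echo "linux/arm64" ;;
    *)             echo "unknown"; return 1 ;;
  esac
}

# Report which platform a plain `docker build` would target on this machine.
host_platform="$(docker_platform "$(uname -m)")" || true
echo "This machine builds for: ${host_platform}"

# To target an x86_64 machine from an ARM machine, the platform can be
# requested explicitly. Uncomment to run from the Dockerfile's directory:
# docker build --platform linux/amd64 -t asoberan/abrfseurat .
```

Running docker image inspect --format '{{.Os}}/{{.Architecture}}' on a finished image is a quick way to confirm which platform it was actually built for.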
I've created images for both amd64 and arm64. If you're running a PC or an Intel-based Mac, you'll want to use the tag latest-x86_64. If you're running Apple Silicon or another ARM processor, you'll want to use the tag latest-arm64.
docker run --rm -it -p 8787:8787 asoberan/abrfseurat:<tag>
Once the Docker image is pulled and running, you can navigate to http://localhost:8787 and log in to the RStudio instance with user rstudio and the given password. All the libraries needed for Seurat should be available out of the box.
However, we've fallen into the same problem as previously: we are running this instance of RStudio locally on our computers. How can we take advantage of this image on the Luria cluster?
