Building Docker Images
Docker is a container engine, but it's also an image build tool. You can build Docker images yourself by creating a Dockerfile, essentially a file that outlines each step in creating your image.
Below are the common commands used in a Dockerfile to outline these steps:
FROM
- Specifies the base image you're building on.
LABEL
- A simple label attached to your image as metadata. A common label is description, used for writing a description of the image.
RUN
- Runs the command you specify while the image is being built. For example, if the base image is Ubuntu, you can run any Ubuntu command here. A common thing to run is apt-get install <package>, which installs an Ubuntu package into your image.
CMD
- The command that runs when a container is started from the image. This tends to be the main software being packaged.
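To make the four instructions concrete, here is a minimal sketch that writes a small example Dockerfile from the shell. The file name Dockerfile.example, the ubuntu:22.04 base, and the choice of curl are illustrative assumptions, not part of the Seurat image built later.

```shell
# Write a small illustrative Dockerfile (hypothetical example, separate from the Seurat one).
cat > Dockerfile.example <<'EOF'
# FROM: the base image to build on
FROM ubuntu:22.04
# LABEL: metadata attached to the image
LABEL description="Ubuntu with curl installed"
# RUN: a command executed while the image is being built
RUN apt-get update && apt-get install -y curl && rm -rf /var/lib/apt/lists/*
# CMD: the default command when a container starts from this image
CMD ["curl", "--version"]
EOF
```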
Knowing these is enough to build a simple Docker image. We'll be using this knowledge to build our own Docker image for Seurat.
Seurat is an R package designed for QC, analysis, and exploration of single-cell RNA-seq data. Seurat aims to enable users to identify and interpret sources of heterogeneity from single-cell transcriptomic measurements, and to integrate diverse types of single-cell data.
We'll use rocker/rstudio as a base so that we have RStudio available to us automatically.
Create a file named "Dockerfile".
First, we must select the base image. We'll use rocker/rstudio version 4.3.2, which comes with R 4.3.2. We'll make sure to label the image with a simple description.
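As a sketch of this first step, the snippet below creates the Dockerfile from the shell with the base image and a description label. The description text is a placeholder chosen for illustration, and the snippet overwrites any existing Dockerfile in the current directory, so it is best run in an empty working directory.

```shell
# Create the Dockerfile with the base image and a description label.
# The description string is an illustrative placeholder.
cat > Dockerfile <<'EOF'
FROM rocker/rstudio:4.3.2
LABEL description="RStudio with Seurat for single-cell RNA-seq analysis"
EOF
```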
Then, we must outline the steps needed to install Seurat 4. rocker/rstudio is built on top of Ubuntu, so any system packages we need should be installed with Ubuntu's apt-get utility. The following packages are needed for the installation of Seurat and other tools:
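A sketch of what this step might look like, appended to the Dockerfile from the shell. The package names below are assumptions: they are system libraries that R packages in this space commonly compile against (HDF5, curl, OpenSSL, XML, PNG, GLPK), not necessarily the exact list used for this image.

```shell
# Append an apt-get step to the Dockerfile.
# The package list is an assumed, typical set; adjust it to what your R packages need.
cat >> Dockerfile <<'EOF'
RUN apt-get update && apt-get install -y --no-install-recommends \
    libhdf5-dev \
    libcurl4-openssl-dev \
    libssl-dev \
    libxml2-dev \
    libpng-dev \
    libglpk-dev \
    && rm -rf /var/lib/apt/lists/*
EOF
```

Running apt-get update and apt-get install in the same RUN instruction keeps the package index and the install in one image layer, and removing /var/lib/apt/lists afterwards keeps that layer smaller.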
Now, we can run R to install Seurat and other useful R tools, including BiocManager, which we'll use in the next step to install useful bioinformatics R libraries.
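One way this step could be written, again appended from the shell. This is a minimal sketch that installs only the two packages named above. It installs whichever Seurat release the image's configured CRAN repository serves, so pinning a specific major version would need an extra step that is not shown here.

```shell
# Append an R package installation step to the Dockerfile.
# Only Seurat and BiocManager are listed; other tools are left out of this sketch.
cat >> Dockerfile <<'EOF'
RUN R -e "install.packages(c('Seurat', 'BiocManager'))"
EOF
```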
Installing R libraries using BiocManager:
Installing other tools from GitHub:
All together, the Dockerfile should look like this:
Now that we have the Dockerfile, we can run the docker build command from the command line. We'll want to tag our Docker image with our username and the name of the image, preferably something descriptive. I'll choose asoberan/abrfseurat for my build.
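The build step could look like the following sketch. It assumes the Dockerfile is in the current directory, and it only calls docker when the command is installed so that the snippet is safe to paste anywhere.

```shell
# Image name in the form <username>/<image name>, as chosen in the text.
IMAGE="asoberan/abrfseurat"

# Build from the Dockerfile in the current directory and tag the result.
# The trailing "." is the build context. The build is skipped if docker is not installed.
if command -v docker >/dev/null 2>&1; then
  docker build -t "$IMAGE" .
fi
```

Tagging the image as username/name is what later allows docker push to upload it to Docker Hub under that account.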
Of course, each of you could build this yourselves and have a custom local copy of this image. However, a key benefit of containerization is that it makes programs and environments portable. I've already created the image and uploaded it to Docker Hub, so instead of everyone needing to create their own image, you can just pull my existing image and use it immediately.
I've created images for both amd64 and arm64. If you're running a PC or an Intel-based Mac, you'll want to use the tag latest-x86_64. If you're running Apple Silicon or another ARM processor, you'll want to use the tag latest-arm64.
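If you are unsure which tag applies to your machine, the choice can be sketched as a small shell helper. The function name seurat_tag is hypothetical. The commented docker commands assume the PASSWORD environment variable documented for rocker/rstudio carries over to this derived image.

```shell
# Map a machine architecture string (as printed by `uname -m`) to the image tag from the text.
seurat_tag() {
  case "$1" in
    x86_64|amd64)  echo "latest-x86_64" ;;
    arm64|aarch64) echo "latest-arm64" ;;
    *) echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Full image reference for the machine this runs on.
IMAGE="asoberan/abrfseurat:$(seurat_tag "$(uname -m)")"
echo "$IMAGE"

# To pull and start it (requires Docker; pick your own password):
#   docker pull "$IMAGE"
#   docker run --rm -p 8787:8787 -e PASSWORD=<your-password> "$IMAGE"
```

The -p 8787:8787 option publishes the container's RStudio port on the same port of your computer, which is why the instance is reachable at localhost:8787.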
Once the Docker image is pulled and running, you can navigate to http://localhost:8787 and log in to the RStudio instance with user rstudio and the given password. All the libraries needed for Seurat should be available out of the box.
However, we've run into the same problem as before: we are running this instance of RStudio locally on our computers. How can we take advantage of this image on the Luria cluster?