Conda Environments
Per its website, Conda is a tool that "provides package, dependency, and environment management for any language." Essentially, Conda lets you create your own environments that contain the software you need to run your programs or pipelines.
Creating and Activating an Environment
Conda is provided on Luria as an environment module, so to use it you'll first have to load the miniconda3/v4 module.
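With Environment Modules, loading it would typically look like this (the module name is the one given above):

```shell
# Load the Conda module provided on the cluster
module load miniconda3/v4
```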
When miniconda3 is loaded, you'll be asked to run an initialization command. Make sure to do so.
Now, the conda program will be available to you.
To create a new conda environment, you'll first have to name it. It's typical to make a new environment for a particular task or pipeline, or for a single tool that requires being isolated. Name the environment accordingly. Once the environment is created, it can be activated.
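As a minimal sketch, the two steps might look like the following. The environment name my_pipeline is a hypothetical placeholder; substitute a name that fits your task.

```shell
# Create an empty environment called "my_pipeline" (hypothetical name)
conda create --name my_pipeline

# Activate it so that later conda commands act on this environment
conda activate my_pipeline
```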
What's happening here? When you create an environment, Conda creates a new directory in ~/.conda/envs with the environment's name. This directory is where any packages and libraries that are installed via Conda will be placed. Activating a Conda environment puts this directory's programs on your shell's PATH, so that you can use any packages and libraries present in it as if they were installed to the system.
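If you want to check this yourself, the commands below are one way to see which environments exist and which directory the active one lives in. They assume an environment is currently activated.

```shell
# List all environments; the active one is marked with an asterisk
conda env list

# Print the directory of the currently active environment
echo "$CONDA_PREFIX"
```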
While the Conda environment is activated, you can use Conda to install packages to it. Any packages you tell it to install will be placed in the active environment directory.
Conda installs packages from what are called "channels". Channels are remote repositories that contain packages. Typical channels include anaconda, conda-forge, and bioconda. Each channel contains its own set of packages, so it's best to know which channel hosts the software you need. You can search channels for software with the conda search command, or by simply looking it up online.
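For illustration, searching a channel and then installing from it could look like this. The package samtools is only an example chosen here, not one named by this guide; the environment you want to install into should already be activated.

```shell
# Search the bioconda channel for a package (samtools is an illustrative choice)
conda search --channel bioconda samtools

# Install it into the active environment, allowing dependencies from conda-forge
conda install --channel conda-forge --channel bioconda samtools
```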
Once you are done using an environment, you can deactivate it. This is similar to unloading an environment module. If you ever need that environment again, you simply activate it and proceed to use the programs you installed to it previously.
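Deactivating takes a single command, and reactivating later uses the same activate command as before (my_pipeline is again a hypothetical name):

```shell
# Leave the currently active environment
conda deactivate

# Later, return to it
conda activate my_pipeline
```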
The following is a real-world example of a good use-case for creating a Conda environment.
Let's say you want to use the program radian, which provides a more modern R console experience than baseline R. However, there is no module available for radian. According to radian's documentation, radian is located on the conda-forge channel. Therefore, to make a Conda environment for radian and install both it and R, you'd do the following:
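One way these steps might be written is sketched below. It assumes the environment is named radian and that R is installed through the r-base package on conda-forge; adjust both to your needs.

```shell
# Load Conda (and run the initialization command the module asks for, if you have not already)
module load miniconda3/v4

# Create and activate an environment named "radian" (the name is an assumption)
conda create --name radian
conda activate radian

# Install R (r-base) and radian from the conda-forge channel
conda install --channel conda-forge r-base radian
```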
Now, whenever you want to run radian, you just load in Conda and activate the environment you created for it.
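In a later session, that could look like the following, assuming the environment was named radian:

```shell
# Load Conda, activate the environment, and start the console
module load miniconda3/v4
conda activate radian
radian
```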
Declaratively Defining a Conda Environment
Instead of imperatively creating an environment, you can create a YAML file that describes the structure of your environment, such as its name, what channels it will pull packages from, and what packages it needs, then have Conda create an environment from that file.
Defining an environment this way makes it easy to remember what packages you need for your use-case in case you need to recreate the environment in the future. It also makes it easy to share an environment with other researchers so that they can get up and running quickly.
Consider the following example: you need to run a pipeline that requires the following pieces of software with the corresponding version:
You can create the following YAML file called pipeline_example.yml:
This YAML file details the name of the Conda environment, what channels it should install packages from, what packages need to be installed, and the Conda prefix directory.
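A file with the four parts described above could be written from the shell as sketched here. The package names and versions are hypothetical placeholders, not the ones this pipeline requires, and the prefix assumes the default ~/.conda/envs location mentioned earlier.

```shell
# Write an example environment file; ${HOME} is expanded by the shell when the file is written
cat > pipeline_example.yml <<EOF
# Name of the environment
name: pipeline_example
# Channels to pull packages from
channels:
  - conda-forge
  - bioconda
# Packages to install (placeholder names and versions)
dependencies:
  - python=3.10
  - samtools=1.17
  - fastqc=0.12.1
# Directory where the environment will live
prefix: ${HOME}/.conda/envs/pipeline_example
EOF
```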
Now, you can have Conda create the environment and activate it:
This saves you the trouble of creating the environment yourself then manually installing each package.
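Assuming the file is in your current directory, creating and activating the environment might look like this:

```shell
# Build the environment described in the YAML file
conda env create --file pipeline_example.yml

# Activate it using the name given in the file
conda activate pipeline_example
```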
If you ever need to make changes to this environment, you can update the YAML file, then run:
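A typical update command is sketched below. The --prune flag, which asks Conda to remove packages no longer listed in the file, is an optional addition made here.

```shell
# Apply the edited YAML file to the existing environment
conda env update --file pipeline_example.yml --prune
```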
Conda Environments in Slurm
Using your Conda environments in a Slurm script is very similar to using Environment Modules in a Slurm script. You just have to add commands to the script that load miniconda3 and then activate the appropriate environment, like so:
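A minimal sketch of such a script follows. The #SBATCH options, the environment name pipeline_example, and the final command are all illustrative assumptions; replace them with the settings and commands your job needs.

```shell
#!/bin/bash
#SBATCH --job-name=conda_example   # hypothetical job name
#SBATCH --ntasks=1                 # hypothetical resource request
#SBATCH --output=slurm-%j.out      # %j is replaced by the job ID

# Load Conda; if the module asks you to run an initialization command, add it here
module load miniconda3/v4

# Activate the environment the job should run in
conda activate pipeline_example

# Run your program(s) from the environment (placeholder command)
python --version
```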
Sharing Conda Environments
If you would like to share a virtual environment that you've created with others, it's important to export the environment first. In so doing, you protect yourself from any modifications the other user might make to your environment, and make that environment portable, so that they can copy it to their own directory, or build on top of it without affecting your work.
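Exporting and recreating an environment might look like the following sketch. The environment name, the output filename, and the new name chosen by the recipient are hypothetical.

```shell
# Export the environment's name, channels, and installed packages to a file
conda env export --name pipeline_example > pipeline_example_export.yml

# The other user can then build their own copy from that file,
# optionally under a different name
conda env create --file pipeline_example_export.yml --name my_copy
```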
For more information, see the official Conda documentation and the more detailed guide from The Carpentries.