Luria Cluster

Luria.mit.edu is a Linux cluster built in 2017 to provide computing resources for life sciences workloads at the Koch Institute for Integrative Cancer Research and affiliated labs and DLCs at MIT.

It has 2272 cores across 57 nodes:

Nodes      RAM
c1-9       128GB
c10        192GB
c11-40     96GB
b1-16      768GB or 384GB
Head node  64GB

Nodes      CPU Cores   CPU Model
c1-4       16 cores    Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60GHz
c5-40      8 cores     Intel(R) Xeon(R) CPU E5620 @ 2.40GHz
b1-16      48 cores    Intel(R) Xeon(R) Gold 6240R CPU @ 2.40GHz or 5220R CPU @ 2.20GHz
Head node  16 cores    Intel(R) Xeon(R) CPU E5-2620 v4 @ 2.10GHz

Accessing Luria

To request a Luria account, please email luria-help@mit.edu from your MIT email account with the following information:

  • Your lab affiliation.

  • What kind of work you are looking to accomplish on the Luria cluster.

  • Whether you have previous experience with Linux and cluster computing.

Connecting to Luria

You must be connected to the MIT VPN if you are not connected to MITNet or the MIT Secure Wi-Fi.

Visual Studio Code

To connect to Luria using Visual Studio Code, follow the instructions at this website. When you reach the step "Connect to Host", use <your-kerberos-username>@luria.mit.edu as your host. Enter your Kerberos password when prompted.
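If you connect regularly, you can also add a host entry to the SSH configuration file on your local machine so that Remote-SSH offers Luria by name. A minimal sketch, where the alias luria is just an illustrative name:

# ~/.ssh/config on your local machine
Host luria
    HostName luria.mit.edu
    User <your-kerberos-username>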

You can then open any folder or workspace on Luria by going to File > Open and edit it directly in Visual Studio Code.

After successfully connecting, go to the VS Code top bar and select Terminal, then New Terminal, as shown here. This will open a remote shell on the Luria cluster where you can run commands.

SecureCRT

SecureCRT is available to download from IS&T's website. To create a connection using SecureCRT, click Quick Connect, then set the hostname to luria.mit.edu and the username to your Kerberos username.

Enter your Kerberos password when prompted.

Windows / Linux / macOS Terminal

The Windows Terminal is available to download from the Microsoft Store. If you do not wish to download it, you can use the built-in Windows PowerShell to run the following commands. Linux and macOS come with a terminal and an SSH client built in.

  • Run: ssh <your-kerberos-username>@luria.mit.edu

  • Enter your MIT Kerberos password when prompted.

Data Storage

Each user on Luria will have at least two storage areas: a home directory and a project directory. Both are backed up daily.

Home Directory

Each user has a home directory, intended for commonly used scripts, environment configuration files, and your own software. A standard user account has a storage quota of 10GB in the home directory. Please do not store large data files under your home directory, as they can quickly fill up the space.

Project Directory (~/data)

In your home directory, there is usually a symbolic link called data which points to your storage server. A symbolic link (symlink) is a file that points to a different location. Symbolic links are useful for saving space: instead of storing a large directory in a location such as your Luria home, you can use a symbolic link to point to that directory on a resource with more abundant storage.

Programs will automatically follow the symbolic link and behave as if it were actually the directory. Symlinks also serve as shortcuts to different locations in the filesystem. For example, the symbolic link called data in your home directory points to your directory in your lab's storage share. This is probably one of the following:

  1. /home/<username>/data -> /net/bmc-lab2/data/lab/<lab-name>/<username>

  2. /home/<username>/data -> /net/bmc-pub14/data/<lab-name>/users/<username>

Please do not delete the data symbolic link. The project directory is intended to hold all your data, which resides on either bmc-pubX or bmc-labX storage servers.

Your home directory also has five hidden directories: .local, .R, .singularity, .conda, and .jupyter. These directories are specifically named and required by their respective programs to store files, configurations, images, and packages. Since they can grow quite large, they are linked to your storage server.

Starting in February 2024, these directories point to the following hidden directories:

  • /home/<username>/.singularity -> /home/<username>/data/.singularity

  • /home/<username>/.conda -> /home/<username>/data/.conda

  • /home/<username>/.jupyter -> /home/<username>/data/.jupyter

  • /home/<username>/.local -> /home/<username>/data/.local

  • /home/<username>/.R -> /home/<username>/data/.R

For older accounts, these directories may look somewhat different, for example:

  • /home/<username>/.singularity -> /home/<username>/data/singularity

  • /home/<username>/.conda -> /home/<username>/data/conda

  • /home/<username>/.jupyter -> /home/<username>/data/jupyter

  • /home/<username>/.local -> /home/<username>/data/local

  • /home/<username>/.R -> /home/<username>/data/R

Disk quota exceeded

If your home directory exceeds the 20G quota, you will receive the error "Disk quota exceeded" when writing to your home directory. If this happens, examine which directories and files take up the most space using the du command, then move those large directories and files to your project directory ~/data on the storage server.

$ du -shc $(ls -A)
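Once you have identified what is taking up space, move it into your project directory so it lives on the storage server rather than counting against your home quota. A minimal sketch, where large_results is just an example directory name:

$ mv ~/large_results ~/data/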

Data Storage Cleanup and Compression

While computing on Luria is free, data storage is charged at $100/TB/year. To reduce operational costs and make better use of the storage server, we recommend checking your data periodically, removing data that is no longer needed and compressing files that you will not need for a while. Here is an example command to compress all fastq files in the current directory and its sub-directories that are older than 180 days. Please remember to run it through either a batch script or an interactive login to a compute node.

$ find . -name \*.fastq -mtime +180 | parallel -j16 gzip -v {}
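For example, you could run the compression from an interactive shell on a compute node. A minimal sketch, assuming a standard Slurm interactive session (the core count requested matches the -j16 given to parallel):

$ srun -n 1 -c 16 --pty bash
$ find . -name \*.fastq -mtime +180 | parallel -j16 gzip -v {}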

Data Storage Archival

Each lab has a storage quota in the project directory. When your lab's storage quota is exceeded, you will not be able to add more data to your storage server. As mentioned above, you will then need to do data storage cleanup and compression to free up space. In addition, you can send a request to luria-help@mit.edu to increase your storage quota (with a cost object provided) and/or to do a TSM archival of your folder(s).

We don't offer tiered storage, and we don't use the term cold storage, as all of our storage servers are active. If you have data that you don't need for a while, we can run a TSM archive to MIT storage and then delete the data from our storage server. Archived data is stored in the MIT TSM archive and is no longer accessible from our cluster luria.mit.edu or from our storage servers. If you need the data to be available again on Luria and our storage servers, we can retrieve it from the MIT TSM archive. The MIT archive itself is a free service, but retrieval requires two hours of our labor at a rate of $90/hr.

Transferring data to and from Luria

Using SCP

  • Secure Copy or SCP is a means of securely transferring computer files between a local and a remote host or between two remote hosts.

  • It is based on the Secure Shell (SSH) protocol.

  • The term SCP refers both to the protocol and the program.

  • The command line scp is one of the most common SCP programs and implements the SCP protocol to transfer files.

  • The syntax of the scp command is similar to the syntax of the cp command:

Copying file(s) to luria

# copying file `myfile.txt` from your local machine
# to your home directory in luria
scp myfile.txt <user>@luria.mit.edu:~/myfile.txt

# copying folder `myfolder` from your local machine
# to your home directory in luria
scp -r myfolder <user>@luria.mit.edu:~/

Copying file(s) from luria

# copying file `myfile.txt` from luria to
# your local machine
scp <user>@luria.mit.edu:~/myfile.txt myfile.txt
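To copy a whole directory from Luria, add the -r flag just as in the upload example (myfolder is a placeholder name):

# copying folder `myfolder` from luria to
# your local machine
scp -r <user>@luria.mit.edu:~/myfolder .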

Transferring Files Using Jupyter Notebook

You can upload and download files from the Jupyter notebook interface via your browser. Please follow the instructions at this page.

Software Packages

module

Luria uses module to manage software and software versions (see Management of Software Packages with module). Please note that not all software/versions available on rous are installed on Luria. To get a list of software packages and versions, run

module avail

While module helps you locate software that has been registered with Lmod, many packages installed on Luria are not configured in this way, or may be included with other packages. If a package that you are looking for doesn't show up using module, you can run the locate command to search the filesystem for packages by name.

For example, to find out if tabix is installed, run the command locate tabix. The result should show that the command tabix is available under samtools. You can then run module load samtools/1.3 to have tabix in your environment.
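Putting the two steps together, a typical session looks like this (the samtools/1.3 version string comes from the example above; check module avail for the versions actually installed):

# search the filesystem for tabix
locate tabix

# load the samtools module that provides it
module load samtools/1.3

# confirm that tabix is now on your PATH
which tabix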

Software Installation

See Installation.

Running jobs

Slurm

Luria uses the Slurm scheduler to manage jobs. To submit a job to run on the cluster compute nodes, you will need to create a Slurm batch script and then submit it to the scheduler, as in the sketch below. For more information, please see our separate article on Slurm.
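As a minimal sketch of such a batch script (the job name, resource requests, module, and command are placeholders; see the Slurm article for the options your job actually needs):

#!/bin/bash
#SBATCH --job-name=example        # placeholder job name
#SBATCH --ntasks=1                # number of tasks
#SBATCH --cpus-per-task=4         # CPU cores for the job
#SBATCH --output=example_%j.out   # output file (%j expands to the job ID)

module load samtools/1.3          # load whatever software your job needs
samtools --version                # replace with your actual command

Save it as, for example, myjob.sh, then submit it with sbatch myjob.sh and check its status with squeue -u $USER.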

Training Session

The Integrated Genomics and Bioinformatics core at MIT (IGB) offers a hands-on introductory session covering Linux usage and cluster computing with KI/BioMicro computational resources. This training session is targeted at users who are new to Linux and/or cluster computing. It is currently offered every 6 weeks, and the registration fee is managed through iLabs. If you are interested in the next training session or one-on-one training, please email luria-help@mit.edu for details.
