SSH Port Forwarding for Jupyter Notebooks

To install Jupyter Notebooks, it's best to create a dedicated Conda environment for it. Jupyter Notebooks is a fairly large piece of software, so Conda has to resolve a large set of dependencies and then download a lot of packages. While this isn't especially resource-intensive, it's still a good idea to run the installation on a compute node.

srun --pty bash

module load miniconda3/v4

source /home/software/conda/miniconda3/bin/condainit

conda create --name jupyter_environment

conda activate jupyter_environment

conda install -c anaconda jupyter

Now that Jupyter has been installed and we're on a compute node, we should be able to start up a Jupyter Notebooks server by running the following:

jupyter notebook --ip=0.0.0.0 --port=12345

# This should print several URLs for accessing Jupyter Notebooks
# We will use the last URL, which looks like this:
http://127.0.0.1:12345/?token=63e593c74248876b14c0d4299f454cb8ccd1f18725538c5e

What is happening here?

When a Jupyter Notebook server runs, it does two important things: it binds itself to an IP address, and it binds itself to a port.

By binding itself to an IP address, the server determines which IP address it can be reached at. That address must be one of the IP addresses belonging to the computer the server is running on. The special address 0.0.0.0, which we passed with --ip above, means "every address this computer has".

However, many different services can run on a single computer, so each service also binds itself to a port. When something tries to connect to the IP address of a computer, it must specify which port on that computer it wants to access. You can see this in the URL that Jupyter Notebooks gave us, which contains the IP address 127.0.0.1 and the port 12345.
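To make the parts of that URL concrete, here is a minimal sketch that splits it into IP address, port, and token using plain shell parameter expansion. The URL is the example from above, and the variable names are only illustrative.

```shell
# Split a Jupyter URL into IP address, port, and token.
# The URL is the example printed earlier; substitute your own.
url='http://127.0.0.1:12345/?token=63e593c74248876b14c0d4299f454cb8ccd1f18725538c5e'

rest=${url#*://}        # drop the scheme -> 127.0.0.1:12345/?token=...
hostport=${rest%%/*}    # drop the path   -> 127.0.0.1:12345
host=${hostport%%:*}    # text before ":" -> 127.0.0.1
port=${hostport##*:}    # text after ":"  -> 12345
token=${url##*token=}   # text after "token="

echo "IP address: $host"
echo "Port:       $port"
echo "Token:      $token"
```

The IP address and port say where the server is listening. The token is the password Jupyter Notebooks expects from the browser.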

Jupyter Notebooks is now running and should be accessible at the specified IP address and port. However, if we go to this URL in our own computer's browser, we'll see that we aren't able to connect to it.

Why is this?

Jupyter Notebooks is running on one of the compute nodes. So when it binds to an IP address and port, it does so on that compute node, which sits on an internal network that is not public. On top of that, the IP address 127.0.0.1 (also known as localhost) is the "loopback address": it always means the computer referring to itself. So when we type this URL into our own computer's web browser, our computer looks for port 12345 on ITSELF. Of course, our computers aren't running Jupyter Notebooks, so we won't actually connect to anything.
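As a rough illustration of the loopback idea, the sketch below classifies a few addresses. The is_loopback helper is hypothetical and uses a simple pattern match, not a full address parser.

```shell
# Report whether an address is a loopback address, i.e. whether it
# refers to the machine doing the lookup rather than to a remote host.
# is_loopback is an illustrative helper, not a standard command.
is_loopback() {
    case "$1" in
        localhost|127.*|::1) return 0 ;;
        *) return 1 ;;
    esac
}

for addr in 127.0.0.1 localhost 0.0.0.0; do
    if is_loopback "$addr"; then
        echo "$addr: loopback (this computer only)"
    else
        echo "$addr: not loopback"
    fi
done
```

Any URL whose address falls in the first group is answered by whichever computer the browser is running on, never by the compute node.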

How can we access this Jupyter Notebook server when it's running on a completely different private network?

Using SSH Port Forwarding

We're already using a tool that lets us create secure connections to another computer: SSH. We can leverage a feature built into SSH, port forwarding, to make a port on a remote computer appear as a port on our local computer.

However, remember the structure of our Luria cluster. We first SSH to the Luria head node, because the compute nodes are not available for us to SSH into directly. From the head node, we only have access to the compute nodes by going through Slurm. Fortunately, Slurm lets us SSH into a compute node as long as we have a job running on it, which is true in our case since we are running a Jupyter Notebooks instance there. Therefore, we have to port forward twice: once to connect our computer to the head node, then a second time to connect the head node to the compute node.

To test the SSH tunnels as we create them, we'll run a basic nc server on the Luria head node. All that this does is listen for connections on the port we specify. We'll then use SSH port forwarding to forward that port from the head node to our local computer's port so that it looks like the nc server is listening on our local computer's network.

On Luria, start the nc server on a random port, 23456 in this case:

nc -l 23456

On your local computer, start SSH port forwarding:

ssh -t <username>@luria.mit.edu -L 23456:localhost:23456

This command links port 23456 on the Luria head node to port 23456 on our local computer. Anything sent to the local port travels through the SSH connection and comes out at that port on the head node.
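The argument to -L has three fields: the local port, the destination host, and the destination port. The destination host is looked up from the SSH server's side, which is why localhost here means the Luria head node and not our own computer. A small sketch that spells this out (describe_forward is a hypothetical helper, not part of ssh):

```shell
# Explain an "ssh -L" forwarding spec of the form
#   <local port>:<destination host>:<destination port>
# describe_forward is an illustrative helper, not part of ssh.
describe_forward() {
    spec=$1
    local_port=${spec%%:*}   # first field
    rest=${spec#*:}          # everything after the first ":"
    dest_host=${rest%%:*}    # second field
    dest_port=${rest##*:}    # third field
    echo "local port $local_port -> $dest_host:$dest_port (resolved on the SSH server)"
}

describe_forward 23456:localhost:23456
```

The two port numbers don't have to match. Using the same number on both ends just makes the setup easier to follow.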

To verify that SSH port forwarding is working correctly, open another terminal on your local computer and use nc to send data to local port 23456:

nc localhost 23456
Hello!

On Luria, you should see this:

nc -l 23456
Hello!

We've successfully made the first connection to the head node. Now we must make the second connection, from the head node to the compute node where Jupyter Notebooks is running. To do so, stop the SSH port forwarding command you just ran. We are going to modify it slightly for a Jupyter Notebooks server running on port 12345 on compute node c20.
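If you didn't note which compute node your job landed on, the following is a minimal sketch for printing it from the srun shell, ideally before starting Jupyter Notebooks. It assumes Slurm sets the SLURMD_NODENAME variable inside the job and falls back to the system's own node name otherwise.

```shell
# Run this inside the srun shell on the compute node.
# SLURMD_NODENAME is assumed to be set by Slurm inside a job; if it is
# missing, fall back to the node name reported by the system.
node=${SLURMD_NODENAME:-$(uname -n)}
node=${node%%.*}   # drop any domain suffix, keeping the short name
echo "Jupyter will run on compute node: $node"
```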

On your local computer:

ssh -t <username>@luria.mit.edu -L 12345:localhost:12345 ssh c20 -L 12345:localhost:12345

We've appended a second SSH port forwarding command to the end of the first. The first part does the same thing as before: it links port 12345 on Luria to port 12345 on your local computer. The second part is run on Luria, and links port 12345 on compute node c20 to Luria's port 12345.
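To see how the two hops fit together, this sketch assembles the same command from its parts and prints it rather than running it. The variable names are illustrative, and the values are the ones used in this example.

```shell
# Build the two-hop forwarding command from its parts.
# user, node, and port are placeholders to replace with your own values.
user='<username>'
node=c20
port=12345

# Hop 1: local computer -> Luria head node
hop1="ssh -t ${user}@luria.mit.edu -L ${port}:localhost:${port}"
# Hop 2: Luria head node -> compute node (this part runs on Luria)
hop2="ssh ${node} -L ${port}:localhost:${port}"

# Print the command instead of running it, so it can be checked first.
echo "$hop1 $hop2"
```

In each hop, localhost refers to the machine that hop connects to: the head node in the first part and the compute node in the second.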

Now, if we open our browsers and navigate to the URL that Jupyter Notebooks gave us earlier, we should see Jupyter Notebooks!

Full Jupyter Notebooks Example

On Luria:

# Connect to a compute node. Take note of which compute node you land on
srun --pty bash

# Load the Conda module environment
module load miniconda3/v4

source /home/software/conda/miniconda3/bin/condainit

# SKIP CREATING THE CONDA JUPYTER ENVIRONMENT AND INSTALLING JUPYTER IF YOU HAVE AN EXISTING ENVIRONMENT
# IF YOU HAVE AN ENVIRONMENT ALREADY, SIMPLY ACTIVATE IT

# Create Conda environment for Jupyter
conda create --name jupyter_environment

# Activate newly created Jupyter Conda environment
conda activate jupyter_environment

# Install Jupyter
conda install -c anaconda jupyter

# Start Jupyter Notebooks. Choose a random port number between 10000 and 65535. Take note of the URL this gives you.
jupyter notebook --ip=0.0.0.0 --port=<port>
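If you'd rather not invent a port number yourself, here is one possible way to pick one, assuming awk is available on the node. It does not check whether the port is already in use, so if Jupyter Notebooks reports a conflict, run it again.

```shell
# Pick a random port between 10000 and 65535 (the highest valid port number).
port=$(awk 'BEGIN { srand(); print 10000 + int(rand() * 55536) }')
echo "Using port $port"

# The value can then be passed to Jupyter, for example:
#   jupyter notebook --ip=0.0.0.0 --port="$port"
```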

On local computer:

# SSH port forward
ssh -t <username>@luria.mit.edu -L <port>:localhost:<port> ssh <compute node> -L <port>:localhost:<port>

Open your web browser and navigate to the URL that Jupyter Notebooks gave you.

This process will look very similar for most software of this kind. Run the software on a compute node, take note of the compute node and port it is running on, and adjust the SSH port forwarding command accordingly.

Massachusetts Institute of Technology