SSH Port Forwarding Jupyter Notebooks

To install Jupyter Notebooks, it's best to create a dedicated Conda environment for it. Jupyter Notebooks is a fairly large piece of software, so Conda has to resolve many dependencies and then download a lot of packages. While this isn't necessarily resource-intensive, it's still a good idea to run the installation on a compute node.

# Connect to a compute node. Take note of which compute node
srun --pty bash

# Load the Conda module environment
module load miniconda3/v4

source /home/software/conda/miniconda3/bin/condainit

# SKIP CREATING THE CONDA JUPYTER ENVIRONMENT AND INSTALLING JUPYTER
# IF YOU HAVE AN EXISTING ENVIRONMENT.
# IF YOU HAVE AN ENVIRONMENT ALREADY, SIMPLY ACTIVATE IT

# Create Conda environment for Jupyter
conda create --name jupyter_environment

# Activate the Jupyter Conda environment
conda activate jupyter_environment

# Install Jupyter
conda install -c anaconda jupyter

Now that Jupyter has been installed and we're on a compute node, we should be able to start up a Jupyter Notebooks server by running the following:

jupyter notebook --ip=0.0.0.0 --port=12345

# The server prints several URLs for accessing Jupyter Notebooks.
# We will use the last one:
http://127.0.0.1:12345/?token=63e593c74248876b14c0d4299f454cb8ccd1f18725538c5e

What is happening here?

When a Jupyter Notebook server runs, it does two important things: it binds itself to an IP address, and it binds itself to a port.

By binding itself to an IP address, the server dictates which IP address it can be reached at. That address must be one of the IP addresses assigned to the computer the server is running on.

However, many different services can run on a single computer, so each service also binds itself to a port. When something connects to a computer's IP address, it must specify which port on that computer it wants to reach. You can see this in the URL that Jupyter Notebooks printed above: it contains the IP address 127.0.0.1 and the port 12345.
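To make the two parts of the address concrete, here is a small sketch that splits the example URL into its IP address and port using plain shell parameter expansion. The variable names (url, hostport, host, port) are only illustrative.

```shell
# URL copied from the example Jupyter output
url='http://127.0.0.1:12345/?token=63e593c74248876b14c0d4299f454cb8ccd1f18725538c5e'

hostport=${url#*://}        # drop the scheme   -> 127.0.0.1:12345/?token=...
hostport=${hostport%%/*}    # drop the path     -> 127.0.0.1:12345
host=${hostport%%:*}        # text before ':'   -> 127.0.0.1
port=${hostport##*:}        # text after ':'    -> 12345

echo "IP address: $host"
echo "Port:       $port"
```

The port printed here is the number that has to appear in the SSH port forwarding command.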

Jupyter Notebooks is now running and should be accessible at the URL it printed. However, if we open this URL in the browser on our own computer, we'll see that we aren't able to connect to it.

Why is this?

Jupyter Notebooks is running on one of the compute nodes. So when it binds to an IP address and port, it does so on that compute node's internal network. This internal network is not public and can't be reached from our own computers. In fact, the IP address 127.0.0.1 (also known as localhost) is the "loopback address", which always refers to the computer itself. So when we type this URL into our own computer's web browser, our computer looks for that port on ITSELF. Of course, our computers aren't running Jupyter Notebooks, so there is nothing to connect to.

How can we access this Jupyter Notebook server when it's running on a completely different private network?

Using SSH Port Forwarding

We're already using a tool that lets us create secure connections to another computer: SSH. We can use a feature built into SSH, called SSH port forwarding, to forward a port on a remote computer to a port on our local computer.

However, remember the structure of our Luria cluster. We first SSH to the Luria head node. The compute nodes are not available for us to SSH into directly. Slurm has a feature that lets us SSH into a compute node as long as we have a job running on it, which is true in our case since we are running a Jupyter Notebooks instance on a compute node. Therefore, we will have to forward the port across two hops (local computer to head node, then head node to compute node) to complete the connection between our local computer and the compute node.
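The second hop needs the name of the compute node. As a minimal sketch, the following prints that name when run inside the srun session on the compute node. It assumes Slurm has set the SLURMD_NODENAME variable for the job and falls back to the system's own node name otherwise; node_name is just an illustrative variable.

```shell
# Run this inside the srun session on the compute node.
# SLURMD_NODENAME is normally set by Slurm inside a job (assumption);
# fall back to the system's own node name if it is not set.
node_name=${SLURMD_NODENAME:-$(uname -n)}
echo "Job is running on compute node: $node_name"
```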

On your local computer, run the following command. Make sure to fill in the appropriate username, compute node, etc. (If you are using VS Code, do not run this in the VS Code terminal. Open Windows PowerShell or the macOS Terminal and paste the command in there.)

ssh -t <username>@luria.mit.edu -L 12345:localhost:12345 ssh <compute node where your job is running> -L 12345:localhost:12345
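The command chains two forwards: the first ssh forwards the port from your local computer to the head node, and the second forwards it from the head node to the compute node. To avoid typos when filling in the placeholders, here is a sketch of a helper that only prints the command for review. The function name build_tunnel_cmd and the example values alice and node01 are hypothetical.

```shell
# Hypothetical helper: prints the two-hop forwarding command for review.
# It does not open any connection itself.
build_tunnel_cmd() {
    user=$1    # your Luria username
    node=$2    # compute node where the job is running
    port=$3    # port the Jupyter server was started with
    printf 'ssh -t %s@luria.mit.edu -L %s:localhost:%s ssh %s -L %s:localhost:%s\n' \
        "$user" "$port" "$port" "$node" "$port" "$port"
}

# Example with placeholder values (not a real account or node name)
build_tunnel_cmd alice node01 12345
```

Copy the printed line into your local terminal once the values look right.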

Now, if we open our browsers and navigate to the URL that Jupyter Notebooks gave us earlier, we should see Jupyter Notebooks!

Full Jupyter Notebooks Example

On Luria:

# Connect to a compute node. Take note of which compute node
srun --pty bash

# Load the Conda module environment
module load miniconda3/v4

source /home/software/conda/miniconda3/bin/condainit

# SKIP CREATING THE CONDA JUPYTER ENVIRONMENT AND INSTALLING JUPYTER
# IF YOU HAVE AN EXISTING ENVIRONMENT.
# IF YOU HAVE AN ENVIRONMENT ALREADY, SIMPLY ACTIVATE IT

# Create Conda environment for Jupyter
conda create --name jupyter_environment

# Activate the Jupyter Conda environment
conda activate jupyter_environment

# Install Jupyter
conda install -c anaconda jupyter

# Start Jupyter Notebooks. Choose a random 5-digit number for your port.
# Take note of the URL this gives you.
jupyter notebook --ip=0.0.0.0 --port=<port>
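If you would rather not pick the port by hand, the following sketch chooses a pseudo-random 5-digit port. The range 10000 to 65535 is an assumption chosen to stay within the valid port numbers; the port variable is illustrative.

```shell
# Pick a pseudo-random port between 10000 and 65535 (the highest valid port).
# rand() returns a value in [0, 1), so the result never exceeds 65535.
port=$(awk 'BEGIN { srand(); print int(10000 + rand() * 55536) }')
echo "Using port: $port"
```

You could then pass $port to the jupyter notebook command in place of <port>. If that port happens to be taken on the node, pick another one.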

On your local computer, run the following command. Make sure to fill in the appropriate username, compute node, etc. (If you are using VS Code, do not run this in the VS Code terminal. Open Windows PowerShell or the macOS Terminal and paste the command in there.)

# SSH port forward
ssh -t <user>@luria.mit.edu -L <port>:localhost:<port> ssh <compute node> -L <port>:localhost:<port>

Open your web browser and navigate to the URL that Jupyter Notebooks gave you.

This process will look very similar for most software of this kind. Run the software on a compute node, take note of the compute node and port it is running on, and adjust the SSH port forwarding command accordingly.
