Using Containers on HPC Resources

Running Your Applications with Ease

Charles Peterson

Learning Objectives 🎯

Welcome!

In this workshop, we will go over using containers on HPC resources such as UCLA’s Hoffman2.

  • Understand the basics of containers 📚
  • Learn how containers are used in HPC environments 💻
  • This workshop will cover:
    • Basics of containers
    • Virtualization concepts
    • Practical examples

Files for this Presentation 📁

This presentation can be found on our GitHub page

  • Viewing the slides
  • To download the presentation and example files, run the following command (this will download the files from GitHub):
git clone https://github.com/ucla-oarc-hpc/WS_containers

Containers: The Basics 📦

What Are Containers?

Containers are a powerful way to install and run scientific software.

  • Consistency ✔️
    • Software runs the same way, regardless of where the container is executed.
  • Isolation 🔒
    • Containers do not interfere with other containers or with the host, ensuring a secure execution environment.
  • Lightweight and portable ✈️
    • The same container can be easily transferred between computers, HPC systems, or cloud providers.
  • Installing software 🛠️
    • Easily install and manage complex scientific software.

Containerizing Software 🛠️

Containers allow you to:

  • Package applications along with all their dependencies, configurations, libraries, and binaries. This comprehensive packaging ensures that the application runs consistently everywhere.
  • Easily deploy and run them across different systems, facilitating scalability and flexibility.

Transferring Containers 🚚

Containers allow for:

  • Easy transfer between different HPC resources
  • A consistent environment for your software

Traditional Installation 🏗️

Typically, to use your software on Hoffman2, you need to:

  • Transfer code to Hoffman2
# From GitHub
git clone https://github.com/charliecpeterson/mysoftware
# From a website
wget https://www.mysoftware.com/software.tar.gz
# Copy code from another machine (the trailing colon means your home directory on Hoffman2)
scp mysoftware.tar.gz username@hoffman2.idre.ucla.edu:
  • Load Required Modules
module load gcc/10.2.0
module load intel/2022.1.1
  • Compile Your Software
cmake .. -DCMAKE_INSTALL_PREFIX=$HOME/myapps
make && make install
  • Update software’s environment
export PATH=$HOME/myapps/bin:$PATH
export LD_LIBRARY_PATH=$HOME/myapps/lib:$LD_LIBRARY_PATH
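The environment update can also be written so that it behaves well when a variable starts out unset. The lines below are a minimal sketch, assuming the `$HOME/myapps` prefix from the cmake step. The `${VAR:+:$VAR}` form appends the old value only when there is one, which avoids a trailing empty entry (the dynamic loader treats an empty entry in `LD_LIBRARY_PATH` as the current directory).

```shell
# Install prefix assumed from the cmake step (-DCMAKE_INSTALL_PREFIX=$HOME/myapps)
prefix="$HOME/myapps"

# Prepend the new directories; append the old value only if it is non-empty
export PATH="$prefix/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$prefix/lib${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"

# Show the first entry of each variable
echo "PATH starts with:            ${PATH%%:*}"
echo "LD_LIBRARY_PATH starts with: ${LD_LIBRARY_PATH%%:*}"
```

These settings last only for the current shell session; they would have to be repeated in every job script or added to a shell startup file.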

Caution

Challenges with Installing Software 🛠️

  • Researchers face difficulties in managing software installations:
    • Spend time setting up software on Hoffman2
      • Figuring out how to compile
      • Installing dependencies
    • Having to wait for System Admin help
    • Then start all over when using software on a different HPC resource
  • HPC resources (like Hoffman2) are SHARED resources 👥
    • Researchers are running software on the same computing resource
    • No ‘sudo’ and limited yum/apt-get commands available 🚫

Containers vs. Traditional Install ⚖️

  • Traditional install
    • Software dependencies must be installed on the host system. 📁
    • Conflicts can occur between different software versions. ⚠️
    • Challenging to achieve consistent environments across platforms. 📉
  • Containerization
    • Dependencies are packaged within the container. 🎁
    • No conflicts with the host system or other containers. ☮️
    • Consistent and reproducible environments on any platform. 📈

Container Advantages

  • Bring your own OS 🌎
    • Portability ✈️
    • Reproducibility 🔁
    • Design your own environment 🎨
  • Install your application once:
    • Use on any HPC resource 🌐

  • Easily install software with apt/yum 📦

  • Great if software requires MANY dependencies that would be complex to install on Hoffman2. ⛓️

Understanding Virtualization 🖥️

Before diving into containers, it’s important to first understand virtualization, the technology that paved the way.

  • What is Virtualization?
    • Allows multiple isolated environments to run on a single physical machine.
    • Improves resource efficiency by sharing hardware among different virtual machines.
    • Forms the foundation for modern containerization.

Types of Virtualization 📐

  1. Hardware Virtualization - Virtual Machines (VMs)
    • Creates virtual machines with independent OS and resources on a single physical host.
    • Ideal for running different operating systems or when complete OS isolation is required.
    • Example: VirtualBox, VMware, AWS EC2
  2. Operating System Virtualization - Containers ✅
    • Allows multiple isolated user-space instances on the same OS kernel.
    • Efficient and lightweight, suitable for microservices and scalable applications.
    • Example: Docker, Apptainer, Kubernetes
  3. Application Virtualization
    • Packages applications and their dependencies for execution on any compatible system.
    • Perfect for deploying apps without worrying about system compatibility or installing dependencies.
    • Example: App-V, ThinApp, Turbo

Bare Metal Setup: No Virtualization 💻

  • ‘Bare metal’ refers to physical servers running directly on hardware without virtualization. 🔧
    • Similar to running software directly on your laptop
  • Software is installed directly on the host operating system. 💿
  • Uses the physical hardware directly, such as CPU, memory, and storage 📊
  • ✨ Advantages: High performance, direct access to hardware, low overhead. 👍
  • ⚠️ Limitations: Less flexibility, limited isolation between applications, potential underutilization of resources. 👎
  • Software runs directly on the OS, on top of the physical hardware

  • Most applications run this way

    • Including most software loaded with module load

Virtual Machines (VMs): Hardware-Level 🖥️

  • VMs emulate physical computers and run multiple operating systems on a computer.
  • Each VM has its own ‘virtual’ hardware, including CPU, memory, and storage. 💾
  • VMs are managed by a hypervisor (e.g., VirtualBox, VMware) that abstracts the physical hardware. 🎛️
  • VMs provide isolation between environments 🛡️


  • Applications running inside of a VM are running on a completely different set of (virtual) resources

  • A “Machine” within a “Machine”

OS Virtualization: Containers 🐳

  • OS virtualization with containers allows multiple, isolated user-space instances to run on a single host OS.
  • Containers share the host OS kernel but have their own file system, libraries, and dependencies.
  • Containerization provides a consistent and reproducible environment across platforms.


  • Applications running inside a container use the SAME kernel and physical resources as the host OS

  • An “OS” within an “OS”
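One way to see the shared kernel for yourself is to compare `uname -r` on the host and inside a container. This is a small sketch, assuming an image file named `ubuntu_20.04.sif` in the current directory (a hypothetical name; any `.sif` file works). The in-container check is skipped when Apptainer or the image is missing.

```shell
# Kernel release as seen by the host
host_kernel=$(uname -r)
echo "Host kernel:      $host_kernel"

# Hypothetical image name; replace with any .sif file you have
image="ubuntu_20.04.sif"

if command -v apptainer >/dev/null 2>&1 && [ -f "$image" ]; then
    # The container does not boot its own kernel, so this should match the host
    container_kernel=$(apptainer exec "$image" uname -r)
    echo "Container kernel: $container_kernel"
    # The userland (distribution) is the container's own
    apptainer exec "$image" grep PRETTY_NAME /etc/os-release
else
    echo "apptainer or $image not found; skipping the in-container check"
fi
```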

Software for Containers 🔧

Docker 🐳

  • Popular containerization software
  • Many popular cloud container registries to store Docker containers:
    • DockerHub, GitHub Packages, Nvidia NGC
  • MPI over multiple servers not well supported 🚫
  • Most likely NOT available on many HPC systems (not on Hoffman2)

Podman 📦

  • Syntax similar to Docker’s
  • Doesn’t have root daemon processes
  • On some HPC resources (not on Hoffman2, yet) 🔜

Apptainer 🚀

  • Formerly Singularity
  • Designed and developed for HPC systems 🖥️
  • Most likely installed on HPC systems (installed on Hoffman2)
    • Possible to even install it yourself
  • Supports Infiniband, GPUs, MPI, and other devices on the Host
  • Can run Docker containers 🐋

Security considerations 🛡️

  • Built with shared user system environments in mind
  • NO daemon run by root 🚫
  • NO privilege escalation. Cannot gain control over host/Hoffman2 🔒
  • All permission restrictions outside of a container apply to the inside 🔐

Common Usage on Hoffman2 💡

To use Apptainer on Hoffman2, simply load the module:

module load apptainer
  • This is the only module you need to load, no matter what software is in the container
    • Except for an MPI module if running in parallel
module load apptainer
module load intel/2022.1.1

Common Apptainer Commands:

  • Getting a container from somewhere
apptainer pull [options] <URI>
apptainer pull docker://ubuntu:20.04
  • Build a container
apptainer build [options] <image path> <build spec>
apptainer build myapp.sif myapp.def
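The build command reads a definition file such as `myapp.def`. Below is a minimal sketch of what such a file might contain, written out with a shell here-document. The package choice (`python3`) and the runscript are illustrative assumptions, not the workshop's own definition file.

```shell
# Write a small, illustrative definition file
cat > myapp.def <<'EOF'
Bootstrap: docker
From: ubuntu:20.04

%post
    # Runs once, at build time, inside the container
    export DEBIAN_FRONTEND=noninteractive
    apt-get update
    apt-get install -y python3
    rm -rf /var/lib/apt/lists/*

%runscript
    # Executed by "apptainer run myapp.sif"
    exec python3 "$@"
EOF

echo "Wrote myapp.def; next step: apptainer build myapp.sif myapp.def"
```

`Bootstrap` and `From` pick the base image, `%post` installs software into it, and `%runscript` defines what `apptainer run` executes.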

Common Usage Continued 🔧

Common Apptainer commands:

  • Run a command within a container
    • Runs the command python3 test.py inside the container


apptainer exec [options] container.sif [command]
apptainer exec mypython.sif python3 test.py
  • Run the container with a predefined runscript
apptainer run container.sif
  • Start an interactive session inside your container
    • You can interact and run commands inside the container


apptainer shell [options] container.sif
apptainer shell mypython.sif

Note

Apptainer will NOT run on Hoffman2 login nodes.

Apptainer Workflow for running on H2 🔄

  1. Create 🛠️

  2. Transfer ↪️

  3. Run ▶️

Apptainer Workflow (Create) 🛠️

1. Create 🛠️

  2. Transfer

  3. Run

  • Build a container
    • With Apptainer or Docker on your own computer
    • Where you have root/sudo access
    • Typically, Apptainer container files end in .sif
  • Use a pre-built container from a registry such as DockerHub, GitHub Packages, or Nvidia NGC

Apptainer Workflow (Transfer) ↪️

  1. Create

2. Transfer ↪️

  3. Run

Bring your container to Hoffman2:

  • Copy your container to Hoffman2
scp test.sif username@hoffman2.idre.ucla.edu:
  • Pull a container from a container registry
apptainer pull docker://ubuntu:20.04
  • Use a container pre-built on Hoffman2
module load apptainer
ls $H2_CONTAINER_LOC

Apptainer Workflow (Run) ▶️

  1. Create

  2. Transfer

3. Run ▶️

Run Apptainer on your container:

  • Can run in an interactive (qrsh) session
qrsh -l h_data=20G
module load apptainer
apptainer exec mypython.sif python3 test.py
  • Or run as a Batch (qsub) job

  • Create job script myjob.job

#!/bin/bash
#$ -l h_data=20G
module load apptainer
apptainer exec mypython.sif python3 test.py
  • Submit your job
qsub myjob.job
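A slightly fuller job script might also set the working directory, a log file, and a run-time limit. The sketch below writes such a script to `myjob.job`. The `h_rt` value and the log file name are illustrative, and the modules initialization path is an assumption about Hoffman2 that should be checked against the current Hoffman2 documentation.

```shell
# Write an illustrative job script; the quoted 'EOF' keeps $JOB_ID literal
cat > myjob.job <<'EOF'
#!/bin/bash
# Run from the directory the job was submitted from
#$ -cwd
# Write the job log to myjob.<job id>.log and merge stderr into it
#$ -o myjob.$JOB_ID.log
#$ -j y
# Memory per core and wall-clock limit (illustrative values)
#$ -l h_data=20G,h_rt=1:00:00

# Assumed location of the modules init script on Hoffman2
. /u/local/Modules/default/init/modules.sh
module load apptainer

apptainer exec mypython.sif python3 test.py
EOF

echo "Wrote myjob.job; submit it with: qsub myjob.job"
```

Comments sit on their own lines because text placed after a `#$` directive would be read as part of the directive.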

MAJOR TAKEAWAY

  • Apptainer containers run like any other application.
  • Run the same commands as you normally would
    • Just add an Apptainer command to any command you want to run inside the container

So….

python3 test.py
R CMD BATCH test.R

Turns into

apptainer exec mypython.sif python3 test.py
apptainer exec myR.sif R CMD BATCH test.R
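Because the pattern is always "the same command with `apptainer exec <image>` in front", it can be wrapped in a small shell function. The helper below is a hypothetical convenience, not part of Apptainer; `run_in` and `DRY_RUN` are names made up for this sketch. With `DRY_RUN=1` it only prints the command it would run, so it can be tried on a machine without Apptainer.

```shell
# run_in IMAGE COMMAND [ARGS...]
# Runs COMMAND inside IMAGE; with DRY_RUN=1 it only prints the command line.
run_in() {
    image=$1
    shift
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "apptainer exec $image $*"
    else
        apptainer exec "$image" "$@"
    fi
}

# Print the commands instead of running them
DRY_RUN=1
run_in mypython.sif python3 test.py
run_in myR.sif R CMD BATCH test.R
```

Unsetting `DRY_RUN` (or setting it to 0) makes the same calls execute inside the containers.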

Examples

  • Example 1: A simple container with TensorFlow
  • Example 2: GPU containers with PyTorch
  • Example 3: Parallel MPI containers
  • Example 4: MPI with NWChem

You can find the workshop material here:

git clone https://github.com/ucla-oarc-hpc/WS_containers

Example 1: TensorFlow (1) 🧠

  • This example will use TensorFlow

    • Great library for developing Machine Learning models
  • We will use the MNIST dataset
    • 60,000 training images (plus 10,000 test images) of handwritten digits

We will use TensorFlow to train a model from this dataset

Example 1: TensorFlow (2)

  • Go to EX1 directory
  • Look at tf-example.py
    • This example uses TF to train a model on the MNIST data

Normally, to run this job, we would run

module load python
python3 tf-example.py

IT DOESN’T WORK!!! TensorFlow needs to be installed!!!

  • You can install it yourself (via pip/conda maybe?)
    • You may run into build errors
    • You have to install it all over again on every other computer

Example 1: TensorFlow (3)

Interactive

  • Start an interactive session
qrsh -l h_data=20G
  • Load the apptainer module
module load apptainer
  • Pull the TF container from DockerHub
apptainer pull docker://tensorflow/tensorflow:2.7.1
  • We see a file named tensorflow_2.7.1.sif
    • This SIF file is the container
    • This container will have an Operating System with Python and TensorFlow already installed inside
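The file name follows a simple pattern: the last part of the image path, with the colon before the tag replaced by an underscore, plus `.sif`. The function below is a sketch that reproduces this naming for `docker://` URIs of the form `name:tag`. `sif_name` is a hypothetical helper, not an Apptainer command, and it does not handle digest references such as `@sha256:...`.

```shell
# sif_name URI
# Prints the default file name "apptainer pull" gives a docker:// image.
sif_name() {
    ref=${1#docker://}          # drop the scheme
    ref=${ref##*/}              # keep the last path component, e.g. tensorflow:2.7.1
    case $ref in
        *:*) name=${ref%%:*}; tag=${ref#*:} ;;
        *)   name=$ref; tag=latest ;;   # no tag given
    esac
    echo "${name}_${tag}.sif"
}

sif_name docker://tensorflow/tensorflow:2.7.1       # tensorflow_2.7.1.sif
sif_name docker://nvcr.io/nvidia/pytorch:22.03-py3  # pytorch_22.03-py3.sif
```

Knowing the name in advance is handy in job scripts that pull an image and then run it in the next line.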

Example 1: TensorFlow (4)

  • Start an interactive shell INSIDE the container
apptainer shell tensorflow_2.7.1.sif
  • Now that we are in the container, we can run Python with TensorFlow!
python3 tf-example.py

Tip

  • Notice that we didn’t need to load any Python module!
  • This Python resides in the container
  • We didn’t need to install any TensorFlow packages ourselves!!

Example 1: TensorFlow (5)

Batch

  • Working interactively inside the container (previous slide)
    • apptainer shell [container.sif]
  • Run a single command in the container
    • apptainer exec [container.sif] [command]
qrsh -l h_data=20G
module load apptainer
apptainer pull docker://tensorflow/tensorflow:2.7.1
apptainer exec tensorflow_2.7.1.sif python3 tf-example.py

Alternatively, you can submit this as a batch job

  • Example job script: tf-example.job
qsub tf-example.job

Example 2: GPUs with PyTorch (1) 🎆

  • This example uses PyTorch with GPU support for faster speed 🚀
    • Another great Machine Learning framework
  • Go to the EX2 directory
    • Examine the pytorch_gpu.py file
    • Fits a third-order polynomial to a sine function
  • To run this example, we’ll need to find a container with GPU support!

Example 2: GPUs with PyTorch (2)

Let’s run python3 pytorch_gpu.py on a GPU node

  • Start an interactive session with a GPU compute node
qrsh -l h_data=20G,gpu,V100,cuda=1
  • Download the PyTorch container from Nvidia NGC
module load apptainer
apptainer pull docker://nvcr.io/nvidia/pytorch:22.03-py3
  • Run apptainer with the --nv option.
    • This enables the container to use the host’s GPU drivers
apptainer shell --nv pytorch_22.03-py3.sif
python3 -c "import torch; print(f'GPU is available: {torch.cuda.get_device_name(0)}' if torch.cuda.is_available() else 'GPU is NOT available')"
  • Run python3 as a single command
apptainer exec --nv pytorch_22.03-py3.sif python3 pytorch_gpu.py

Alternatively, you can submit this as a batch job using a job script

qsub pytorch_gpu.job
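The `--nv` option only makes sense on a node that actually has an NVIDIA GPU. A script that may run on both kinds of nodes could decide at run time, as in the sketch below. It assumes that a working `nvidia-smi` on the host is a good enough sign that a GPU is present, and it only prints the resulting command.

```shell
# Use --nv only when the host reports at least one NVIDIA GPU
nv_flag=""
if command -v nvidia-smi >/dev/null 2>&1 && nvidia-smi -L >/dev/null 2>&1; then
    nv_flag="--nv"
fi

# Build the command line; the flag is left out entirely on CPU-only nodes
cmd="apptainer exec ${nv_flag:+$nv_flag }pytorch_22.03-py3.sif python3 pytorch_gpu.py"
echo "$cmd"
```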

Example 3: Parallel MPI container ⚙️

You can run MPI parallel software inside containers. This example will compile and run a simple MPI code with an MPI-enabled container.

Containers for MPI software that you find (or build) will commonly have MPI installed inside the container.

You will also need an MPI build outside the container, on the host.

In this hybrid approach, the host’s MPI launcher starts the processes, and the MPI build inside the container relies on the MPI implementation available on the host system.

Example 3: Running Parallel code 🌐

  • Load both the apptainer and intel modules
    • The intel (oneAPI) module will load IntelMPI on the outside host
qrsh -l h_data=10G,arch=intel-gold\* -pe shared 3
module load apptainer
module load intel/2022.1.1
  • Pull the oneAPI HPC Toolkit container from DockerHub
    • Container with Intel MPI, compilers, libraries, and tools for HPC
    • Creates oneapi-hpckit_2025.0.2-0-devel-ubuntu24.04.sif container file
apptainer pull docker://docker.io/intel/oneapi-hpckit:2025.0.2-0-devel-ubuntu24.04
  • Compile MPI code
    • Run mpiicx -o myMPIcode.x myMPIcode.c inside the container
apptainer exec oneapi-hpckit_2025.0.2-0-devel-ubuntu24.04.sif mpiicx -o myMPIcode.x myMPIcode.c
  • Execute MPI code
    • Run ./myMPIcode.x inside the container, launched by the host’s mpirun on the 3 requested cores
mpirun -np 3 apptainer exec oneapi-hpckit_2025.0.2-0-devel-ubuntu24.04.sif ./myMPIcode.x
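In the hybrid approach, the launcher on the host starts one `apptainer exec` per MPI rank. The sketch below builds that launch line and takes the rank count from `$NSLOTS`, which the scheduler sets inside a parallel-environment job (for example `-pe shared 3`). Outside a job it falls back to 1 rank. The command is executed only if `mpirun`, `apptainer`, the image, and the compiled program are all present.

```shell
# Number of MPI ranks: slots granted by the scheduler, or 1 outside a job
nranks=${NSLOTS:-1}
image="oneapi-hpckit_2025.0.2-0-devel-ubuntu24.04.sif"

# Host mpirun in front, one containerized process per rank
launch="mpirun -np $nranks apptainer exec $image ./myMPIcode.x"
echo "$launch"

if command -v mpirun >/dev/null 2>&1 && command -v apptainer >/dev/null 2>&1 \
   && [ -f "$image" ] && [ -x ./myMPIcode.x ]; then
    $launch
else
    echo "mpirun, apptainer, $image, or ./myMPIcode.x not found; not launching"
fi
```

Using `$NSLOTS` keeps the rank count in step with the cores requested from the scheduler, so the number does not have to be edited in two places.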

Example 4: MPI with NWChem 🔋

In this example, we’ll run a parallel MPI container using NWChem, a popular chemistry software.

We will use a container with NWChem, built with MPI to run across multiple CPUs.

Typically, we would run NWChem like this:

module load intel/2022.1.1
module load nwchem/7.0.2
`which mpirun` -np 5 nwchem water.nw > water.out
  • On Hoffman2, an NWChem container with MPI has already been built
    • $H2_CONTAINER_LOC/h2-nwchem_7.0.2.sif

Note

  • Location of the containers collected on Hoffman2
    • echo $H2_CONTAINER_LOC

Example 4: Running NWChem

Interactive Job

To run inside the container:

  • Request multiple cores with qrsh
qrsh -l h_data=10G,arch=intel-gold\* -pe shared 5
  • Load the apptainer and intel modules
    • Sets up Intel MPI on the host (outside the container)
module load apptainer
module load intel/2022.1.1
  • Run NWChem software
    • Add mpirun -np 5 in front of apptainer exec
mpirun -np 5 apptainer exec $H2_CONTAINER_LOC/h2-nwchem_7.0.2.sif nwchem water.nw > water.out

Batch Job

  • An example batch job is located in EX4/nwchem.job
qsub nwchem.job

Considerations and Best Practices

  • 📦 Size of container
    • Keep it small and minimal
    • Include only necessary components for your applications
    • Large containers need more memory and take longer to start up
  • 👥 Share .sif files with your friends!
  • 🔧 Experiment with creating your own containers
    • Save your (Docker) containers to DockerHub or GitHub Packages
    • Find examples of Dockerfiles and Apptainer def files on our GitHub
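To keep an eye on container size, it can help to list the `.sif` files in a directory from smallest to largest. The function below is a small sketch using standard tools; `list_images` is a hypothetical helper name, and file names containing spaces are not handled.

```shell
# list_images [DIR]
# Prints the size in MB and the path of each .sif file in DIR, smallest first.
list_images() {
    dir=${1:-.}
    for f in "$dir"/*.sif; do
        [ -e "$f" ] || continue     # no .sif files in this directory
        du -k "$f"                  # size in KB, then the path
    done | sort -n | awk '{ printf "%8.1f MB  %s\n", $1 / 1024, $2 }'
}

list_images .
```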

Workshop Summary 🚀

  • 🔹 What We Learned
    • ✅ Traditional Software Installation vs. Containers
    • ✅ Why Containers?
      • Portability, Reproducibility, Ease of Use
    • ✅ Understanding Virtualization & HPC Challenges
    • ✅ Apptainer Basics & Running Containers on Hoffman2
    • ✅ Practical Examples:
      • TensorFlow, PyTorch (GPU), MPI
  • 💡 Key Takeaways
    • 🔹 Containers simplify software installation & execution
    • 🔹 Use Apptainer to run pre-built software on HPC
    • 🔹 Bring Your Own OS & Dependencies
      • No Admin Required
    • 🔹 Share & Reuse Containers to Save Time.
    • 🔹 Next Step: Build Your Own Containers!

Thank you! ❤️

Questions? Comments? 🤔

Charles Peterson