Running Your Applications with Ease
Welcome!
In this workshop, we will go over using containers on HPC resources, like UCLA’s Hoffman2.
This is Part I of my workshop on Containers.
This presentation can be found on our GitHub page
WS_container.pdf
WS_container.qmd
Containers allow you to:
Containers allow for:
Image Ref: Hoffman2 source, Stampede2 source
Containers offer a lightweight, portable, and consistent environment across platforms. To fully grasp the concept of containers, it’s essential to understand virtualization.
Virtualization allows multiple operating systems to run simultaneously on a single physical machine. Each operating system operates as if it’s the only one running on the hardware. While containers share the same OS kernel, virtual machines have their own OS and resources.
GIF from https://giphy.com/
Software runs directly on the host OS, which runs on the physical hardware
Typical applications run in this fashion
module load software
Applications running inside of a VM are running on a completely different set of (virtual) resources
A “Machine” within a “Machine”
Applications running inside of a container are running with the SAME kernel and physical resources as the host OS
An “OS” within an “OS”
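One way to see the shared kernel in practice is to compare the kernel release reported on the host with the one reported inside a container. The sketch below assumes a hypothetical image at container.sif (override it with the SIF variable) and skips the container check when Apptainer or the image is not available.

```shell
# Kernel release as seen by the host
host_kernel="$(uname -r)"
echo "host kernel:      $host_kernel"

# Hypothetical image path; set SIF to point at a real .sif file
SIF="${SIF:-container.sif}"

if command -v apptainer >/dev/null 2>&1 && [ -f "$SIF" ]; then
    # Inside the container, uname reports the kernel of the host it runs on
    container_kernel="$(apptainer exec "$SIF" uname -r)"
    echo "container kernel: $container_kernel"
else
    echo "apptainer or $SIF not available here; skipping the container check"
fi
```

With a VM, the same comparison would normally show the guest's own kernel instead.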
HPC resources (like Hoffman2) are SHARED resources 👥
Great for easily installing software with apt/yum 📦
Great if your software requires MANY dependencies that would be complex to install on Hoffman2. ⛓️
Podman 📦
Docker 🐳
Security considerations 🛡️
To use Apptainer on Hoffman2, simply load the module:
Common Apptainer commands:
apptainer exec [options] container.sif command [args]
apptainer exec mypython.sif python3 test.py
# Runs the command `python3 test.py` inside the container
Note
Apptainer will NOT run on Hoffman2 login nodes.
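A minimal sketch of a pre-flight check, assuming the module is named apptainer and that interactive sessions are requested with qrsh. The resource values in the comments are illustrative, not recommendations, and the helper name check_apptainer is hypothetical.

```shell
# On a login node, request a compute node first (illustrative values):
#   qrsh -l h_data=10G,h_rt=1:00:00
# Then, on the compute node, load the module (name assumed):
#   module load apptainer

# Hypothetical helper: succeed only if the apptainer command is on PATH.
check_apptainer() {
    command -v apptainer >/dev/null 2>&1
}

if check_apptainer; then
    apptainer --version
else
    echo "apptainer not found: are you on a compute node with the module loaded?"
fi
```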
Create 🛠️
Transfer ↪️
Run ▶️
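The three steps above might look like the following. This is a dry-run sketch that only prints the commands; the image name, the Docker Hub source, the login host, and the username are all placeholders rather than values from this workshop.

```shell
# Placeholders (assumptions; adjust for your own account and image)
IMAGE="${IMAGE:-mypython.sif}"
SOURCE="${SOURCE:-docker://python:3.11}"
REMOTE="${REMOTE:-username@hoffman2.idre.ucla.edu}"

# Print each step of the Create -> Transfer -> Run workflow without running it.
workflow_commands() {
    echo "apptainer pull $IMAGE $SOURCE"           # 1. Create: build a .sif from a registry image
    echo "scp $IMAGE $REMOTE:"                     # 2. Transfer: copy the .sif to the cluster
    echo "apptainer exec $IMAGE python3 test.py"   # 3. Run: execute a command inside the container
}

workflow_commands
```

Step 1 is meant for a machine where you can build or pull images; steps 2 and 3 only make sense once the .sif file exists.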
1. Create 🛠️
.sif
$H2_CONTAINER_LOC
3. Run ▶️
Run Apptainer on your container:
Or run as a Batch (qsub) job
Create job script myjob.job
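A job script for this could look like the sketch below, which writes myjob.job to the current directory. The scheduler directives, resource values, modules init path, and module name are assumptions for illustration; mypython.sif and test.py are the names used in the earlier exec example.

```shell
# Write a sample job script; the quoted 'EOF' keeps $JOB_ID literal in the file
cat > myjob.job <<'EOF'
#!/bin/bash
#$ -cwd
#$ -o myjob.$JOB_ID.log
#$ -j y
#$ -l h_rt=1:00:00,h_data=10G

# Make the module command available in the batch shell (path assumed)
. /u/local/Modules/default/init/modules.sh
module load apptainer

# Same command as in an interactive session
apptainer exec mypython.sif python3 test.py
EOF

echo "Wrote myjob.job; submit it with: qsub myjob.job"
```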
So….
Turns into
You can find the workshop material here:
This example will use TensorFlow
We will use TensorFlow to train a model from this dataset
EX1 directory
tf-example.py
IT DOESN’T WORK!!! Need tensorflow installed!!!
tensorflow_2.7.1.sif
Tip
apptainer shell [container.sif]
apptainer exec [container.sif] [command]
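For EX1, the two forms in the tip could be used as sketched below. The location of tensorflow_2.7.1.sif is an assumption (the current directory here; set SIF to point elsewhere), and run is a hypothetical dry-run wrapper that prints commands unless DRY_RUN=0.

```shell
SIF="${SIF:-tensorflow_2.7.1.sif}"   # image path is an assumption
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Non-interactive: run the training script inside the container
run apptainer exec "$SIF" python3 tf-example.py

# Interactive alternative: open a shell in the container, then run python3 by hand
run apptainer shell "$SIF"
```

exec suits batch jobs, since it returns when the command finishes; shell is handier for exploring what is installed in the image.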
EX2 directory
pytorch_gpu.py file
Let’s run python3 pytorch_gpu.py on a GPU node
Use the --nv option so the container can access the host’s NVIDIA GPUs.
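A sketch of what the GPU run could look like. The image name pytorch.sif is a placeholder, not one named in the workshop, and build_gpu_cmd is a hypothetical helper that only prints the command line so it can be inspected before use.

```shell
# Hypothetical helper: build the GPU-enabled command line for a given image and script.
build_gpu_cmd() {
    sif="$1"
    script="$2"
    echo "apptainer exec --nv $sif python3 $script"
}

# pytorch.sif is a placeholder image name
build_gpu_cmd pytorch.sif pytorch_gpu.py

# Optional sanity check on the node itself: is an NVIDIA driver visible?
if command -v nvidia-smi >/dev/null 2>&1; then
    echo "nvidia-smi found: this looks like a GPU node"
else
    echo "nvidia-smi not found: --nv will have no GPU to expose here"
fi
```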
In this example, we’ll run a parallel MPI container using NWChem, a popular computational chemistry application.
Many applications use MPI to run across multiple CPUs, and NWChem is one of them.
$H2_CONTAINER_LOC/h2-nwchem_7.0.2.sif
Typically, we will run NWChem like this:
module load intel/2022.1.1
module load nwchem/7.0.2
`which mpirun` -np 5 nwchem water.nw > water.out
To run inside the container, put mpirun in front of apptainer exec:
qrsh -l h_data=10G,arch=intel-gold\* -pe shared 5
module load intel/2022.1.1
`which mpirun` -np 5 apptainer exec $H2_CONTAINER_LOC/h2-nwchem_7.0.2.sif nwchem water.nw > water.out
An example batch job is located in EX3/nwchem.job
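For orientation, a batch version of the interactive commands above might look like the sketch below. It is not the contents of EX3/nwchem.job; the scheduler directives, resource values, modules init path, and apptainer module name are assumptions. $NSLOTS is set by the scheduler to the number of slots granted by -pe.

```shell
# Write a sketch job script; the quoted 'EOF' keeps variables and backticks literal
cat > nwchem_sketch.job <<'EOF'
#!/bin/bash
#$ -cwd
#$ -j y
#$ -o nwchem.$JOB_ID.log
#$ -l h_rt=1:00:00,h_data=10G,arch=intel-gold*
#$ -pe shared 5

# Make the module command available in the batch shell (path assumed)
. /u/local/Modules/default/init/modules.sh
module load intel/2022.1.1
module load apptainer

# mpirun from the host launches one container process per slot
`which mpirun` -np $NSLOTS apptainer exec $H2_CONTAINER_LOC/h2-nwchem_7.0.2.sif nwchem water.nw > water.out
EOF

echo "Wrote nwchem_sketch.job; submit it with: qsub nwchem_sketch.job"
```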
Questions? Comments? 🤔
Charles Peterson cpeterson@oarc.ucla.edu