Using Docker Containers with Podman on ERISXdl

ERISXdl Containers

The management of Docker containers and images on the ERISXdl login nodes can be conveniently performed with a tool called podman. The podman program provides access to a significant portion of the Docker container API without requiring root privileges to run its commands. Researchers can pull and use containers built and distributed by their colleagues or by public registries, such as Docker Hub (docker.io), for the purpose of running GPU-based analysis on the ERISXdl GPU nodes.
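
As a quick check that podman is available and working on a login node, you can query its version and storage configuration (the exact output will depend on the installed version):

$ podman --version
$ podman info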

The GPU nodes have no direct connection to the internet, so they cannot run code that requires internet access. Researchers therefore need to prepare and update their containers and code before jobs are submitted to the GPU nodes for analysis. Computational jobs should not be run on the login nodes; they should be submitted through the SLURM scheduler. For more information on SLURM and using containers in submitted jobs, see the Using SLURM Scheduler article.

Images to be run on the compute nodes need to be pushed to the Harbor registry hosted at erisxdl.partners.org. Each research group is provided with its own project space in the Harbor registry, whose name corresponds to the lowercase form of the group's briefcase PAS group name. Group members can log in to erisxdl.partners.org with their Partners user ID and password to examine their group's space. By default, each project is initially allocated 50GB of storage; this can be expanded by submitting a request to @email

For all ERISXdl/SLURM jobs, both the user's home directory and the group briefcase are mounted by default into the runtime Docker container, providing convenient access to research data. This is achieved by means of wrapper scripts, which are described in the Using SLURM Job Scheduler article.

Containers Provided by ERIS HPC Team

Sample containers are provided in the Harbor Registry at the following locations:

  • erisxdl.partners.org/library : containers providing GUI interactive sessions e.g. for JupyterHub and Cryosparc
  • erisxdl.partners.org/nvidia : several curated NVIDIA NVCR containers such as TensorFlow and CUDA

In the future we hope to expand the range of containers offering GPU-powered GUI interactive sessions, and we are open to suggestions in this regard. In terms of implementation, each type of session (JupyterHub, Cryosparc, etc.) has a corresponding wrapper script that is invoked in a SLURM batch job. The wrapper script generates a custom URL that is written to the job log file and remains 'live' for the duration of the SLURM job. The link should be accessible from most modern web browsers. More details can be found in the test cases examined in the ERISXdl/SLURM section.
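
For example, once such a session job is running, the generated URL can typically be located by searching the job's log file (a minimal sketch, assuming the default SLURM log name slurm-<jobID>.out; your submission script may name the log differently):

$ grep -Eo 'https?://[^ ]*' slurm-<jobID>.out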

The containers stored at erisxdl.partners.org/nvidia are generally old and intended for demonstration purposes only. Users are instead advised to obtain the latest CUDA run-time images from NVIDIA's container registry (NGC, nvcr.io), for which the HPC ERIS team has purchased a subscription. The images in this registry are optimized for use on ERISXdl's DGX compute nodes. However, these images are proprietary and are not authorized to be distributed outside of the MGB network; any attempt to do so could result in the termination of services for all MGB research groups who rely upon these images for their work.
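
For reference, a typical workflow for bringing an up-to-date CUDA image from NGC into your group's Harbor project might look like the following (a minimal sketch; the tag is illustrative, and the login assumes you have an NGC account with an API key, used with the literal username $oauthtoken):

$ podman login nvcr.io

Username: $oauthtoken
Password: <NGC API key>
Login Succeeded!

$ podman pull nvcr.io/nvidia/cuda:<tag>
$ podman tag nvcr.io/nvidia/cuda:<tag> erisxdl.partners.org/<PAS Group Name in lowercase>/cuda:<tag>
$ podman push erisxdl.partners.org/<PAS Group Name in lowercase>/cuda:<tag>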

How to Manage Containers: Examples

Example 1: Pulling container images from outside registries

Users can pull container images from registries outside of ERISXdl/Harbor, such as Docker Hub. Depending on the registry, you may need to log in first, which may require an account with that registry. Once logged in, you can pull a container image from the registry, tag the image as your own copy, and push that copy to your Harbor project. To view all the images currently available in your local storage, run the podman images command; this does not reflect the container images you may have in your Harbor project.

For example, the steps below show how an Alpine Linux container is pulled from Docker Hub and stored in the hypothetical 'abc123' username's Harbor project. Please bear in mind that since the end of the pilot phase there are no individual Harbor accounts, only group accounts of the form <PAS Group Name in lowercase>.

  1. Login to the registry/registries you are pulling from and pushing to

    Note: your login credentials for the ERISXdl Harbor registry should be the same as your cluster credentials.

    $ podman login docker.io

    Username: abc123
    Password: *************
    Login Succeeded!

    $ podman login erisxdl.partners.org

    Username: abc123
    Password: *************
    Login Succeeded!

  2. Search for a container


    $ podman search docker.io/alpine

    INDEX       NAME                                             DESCRIPTION                                       STARS   OFFICIAL   AUTOMATED
    docker.io   docker.io/library/alpine                         A minimal Docker image based on Alpine Linux...   7670    [OK]
  3. Pull the container image


    $ podman pull docker.io/library/alpine

    Trying to pull docker.io/library/alpine...
    Getting image source signatures
    Copying blob 5843afab3874 done
    Copying config d4ff818577 done
    Writing manifest to image destination
    Storing signatures
    d4ff818577bc193b309b355b02ebc9220427090057b54a59e73b79bdfe139b83
    $ podman images

    REPOSITORY                                         TAG        IMAGE ID       CREATED        SIZE
    docker.io/library/alpine                           latest     d4ff818577bc   4 weeks ago    5.87 MB
  4. Tag the container image*

    For the alpine example, we tag the alpine image as demo-copy:

    $ podman tag d4ff818577bc erisxdl.partners.org/abc123/alpine:demo-copy

    and if we wish to tag with the PAS Group account we would use

    $ podman tag d4ff818577bc erisxdl.partners.org/<PAS Group Name in lowercase>/alpine:demo-copy

    for example

    $ podman tag d4ff818577bc erisxdl.partners.org/phs-erisadm-g/alpine:demo-copy

    Running podman images again now shows the new tag alongside the original:

    $ podman images

    REPOSITORY                                         TAG         IMAGE ID       CREATED        SIZE
    docker.io/library/alpine                           latest      d4ff818577bc   4 weeks ago    5.87 MB
    erisxdl.partners.org/abc123/alpine                 demo-copy   d4ff818577bc   4 weeks ago    5.87 MB

    *Ideally, tag your image to reflect its current version.

  5. Push the container image 

    $ podman push erisxdl.partners.org/abc123/alpine:demo-copy
    or, for the PAS Group account

    $ podman push erisxdl.partners.org/<PAS Group Name in lowercase>/alpine:demo-copy

    Once it is successfully pushed to your Harbor project, you can now pull your copy to your podman runtime at any time, as well as access it in scripts submitted to the job scheduler.

    Optional (and at user's own risk): to confirm that it was pushed successfully, remove the locally stored image (this will not affect your Harbor project) and pull it again.

    $ podman rmi -f d4ff818577bc

    $ podman pull erisxdl.partners.org/abc123/alpine:demo-copy
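
    Once re-pulled, the copy can also be run directly as a quick sanity check (a minimal sketch that prints the OS release information from inside the Alpine container):

    $ podman run --rm erisxdl.partners.org/abc123/alpine:demo-copy cat /etc/os-release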

Example 2: Pulling provided containers from Harbor

Once in full production, ERISXdl users will be able to choose from several curated, pre-built containers provided through Harbor. In the following example, the hypothetical ‘abc123’ username pulls the public CUDA image and stores a copy of it in their Harbor project. (Please note, this CUDA image is very old and intended for demo purposes only; newer versions of CUDA are available in the NVIDIA catalog.)

  1.  Login to Harbor

    Note: your login credentials for the ERISXdl Harbor registry should be the same as your cluster credentials.
    $ podman login erisxdl.partners.org

    Username: abc123
    Password: *************
    Login Succeeded!
  2.  Pull the container image from Harbor

    Note: depending on the size of the container, this step may take several minutes


    $ podman pull erisxdl.partners.org/nvidia/cuda
    $ podman images

    REPOSITORY                         TAG      IMAGE ID       CREATED       SIZE
    erisxdl.partners.org/nvidia/cuda   latest   979cd1f9e2c8   2 weeks ago   4.24 GB
  3.  Tag the container


    $ podman tag 979cd1f9e2c8 erisxdl.partners.org/abc123/cuda:latest

    and if we wish to tag with the PAS Group account we would use 

    $ podman tag 979cd1f9e2c8 erisxdl.partners.org/<PAS Group Name in lowercase>/cuda:latest

    $ podman images

    REPOSITORY                         TAG      IMAGE ID       CREATED       SIZE
    erisxdl.partners.org/nvidia/cuda   latest   979cd1f9e2c8   2 weeks ago   4.24 GB
    erisxdl.partners.org/abc123/cuda   latest   979cd1f9e2c8   2 weeks ago   4.24 GB
  4.  Push the container


    $ podman push erisxdl.partners.org/abc123/cuda:latest

    or, for the PAS Group account

    $ podman push erisxdl.partners.org/<PAS Group Name in lowercase>/cuda:latest
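
    Note that tagging does not copy any image data: both names refer to the same image ID, which can be confirmed with podman image inspect (a minimal sketch):

    $ podman image inspect --format '{{.Id}}' erisxdl.partners.org/nvidia/cuda:latest
    $ podman image inspect --format '{{.Id}}' erisxdl.partners.org/abc123/cuda:latest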


Example 3: Running and customizing containers

One of the key features of using containers is that the user who runs the container has root permissions inside the running image. This means users can run package managers and make system changes freely within their container. To save the changes you make to a container, you need to run the container image, make your modifications, and then commit those changes with podman before pushing the new version to your Harbor project.

Note: some containers have extra security layers that prevent users from making certain changes even with root permissions. This may prevent users from using package managers and installing applications within the container.

In the following example, the hypothetical ‘abc123’ username (now superseded by <PAS Group Name in lowercase>) runs and updates their copy of the CUDA image and then stores this updated image in their Harbor project.

  1.  Pull the container from Harbor
    $ podman pull erisxdl.partners.org/abc123/cuda:latest
    $ podman images

    REPOSITORY                         TAG      IMAGE ID       CREATED       SIZE
    erisxdl.partners.org/abc123/cuda   latest   979cd1f9e2c8   2 weeks ago   4.24 GB
  2.  Run the container and make any changes in the container, like installing additional packages *

    $ podman run -it 979cd1f9e2c8 /bin/bash

    # Or, if user abc123 wishes to mount both their home directory and group briefcase into the container
    # (note that the -v options must come before the image ID)

    $ podman run -it -v /PHShome/abc123:/home/abc123 -v /data/briefcase:/data 979cd1f9e2c8 /bin/bash

    ## NOTE : once you run the container, you will have root privileges within the container's filesystem
    ## In this example, we install OpenGL utilities using the package manager

    root@54116e44f656:/# apt-get update
    root@54116e44f656:/# apt-get install -y mesa-utils
    root@54116e44f656:/# exit

    * Container images can be run interactively by using the podman run command. Users cannot run computational jobs on the ERISXdl login nodes, and should only run containers on the login nodes when making modifications.

  3.  Commit the changes made as a new container image

    $ podman ps -a

    CONTAINER ID  IMAGE                                     COMMAND    CREATED             STATUS                       PORTS  NAMES
    58b3f6a7ede2  erisxdl.partners.org/abc123/cuda:latest   /bin/bash  About a minute ago  Exited (130) 10 seconds ago
    $ podman commit 58b3f6a7ede2 erisxdl.partners.org/abc123/cuda:with-opengl 
  4.  Push the modified container image to Harbor

    $ podman images

    REPOSITORY                         TAG           IMAGE ID       CREATED          SIZE
    erisxdl.partners.org/abc123/cuda   with-opengl   a7932ec48e13   37 seconds ago   4.27 GB
    erisxdl.partners.org/abc123/cuda   latest        979cd1f9e2c8   2 weeks ago      4.24 GB
    $ podman push erisxdl.partners.org/abc123/cuda:with-opengl
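
    As an alternative to modifying a running container interactively and committing the result, the same customization can be captured reproducibly in a Containerfile and built with podman build (a minimal sketch; the package is illustrative and assumes a Debian/Ubuntu-based base image):

    $ cat Containerfile
    FROM erisxdl.partners.org/abc123/cuda:latest
    RUN apt-get update && apt-get install -y mesa-utils

    $ podman build -t erisxdl.partners.org/abc123/cuda:with-opengl .
    $ podman push erisxdl.partners.org/abc123/cuda:with-opengl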


Podman Settings

On ERISXdl there are three login nodes, erisxdl1, erisxdl2 and erisxdl3, each of which holds a different collection of locally-stored images under

/erisxdl[1-3]/local/storage

To ensure you have access to your images on any given node, locate the following file in your home directory

~/.config/containers/storage.conf

and make the following change using your favorite text editor, where abc123 corresponds to your user ID:

graphroot = "/erisxdl/local/storage/abc123"
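
For context, the graphroot entry lives in the [storage] section of storage.conf; a minimal sketch of the file is shown below (the driver and runroot values are illustrative and may differ on ERISXdl):

[storage]
driver = "overlay"
runroot = "/run/user/<UID>/containers"
graphroot = "/erisxdl/local/storage/abc123"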

With this setting podman will operate normally on all three login nodes, even in the case of a node failure. However, the images available on each node will differ. The images stored in Harbor at

erisxdl.partners.org/<PAS Group Name in lowercase>

will of course always be available. If you run into trouble, please submit a request to hpcsupport asking them to run

sudo ./podman_reset.sh <userID> /erisxdl/local/storage

Finally, please note that on each of the login nodes erisxdl[1-3] each user has a quota of 50GB (as of 2024/01) for the storage of local images. The current storage consumption under

/erisxdl[1-3]/local/storage/<userID>

can be displayed with the following terminal command:

quota -vs
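
The storage consumed specifically by podman images, containers and volumes can also be summarized with podman itself:

podman system df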


Go to KB0038877 in the IS Service Desk