ODU RCS has adopted Singularity as the framework to deploy containerized software in the ODU HPC environment. Newer software packages are furnished as Singularity containers for several good reasons:
Containers provide a way to package software and its dependencies in a single container image.
Containers are meant to be portable (at least, in principle). Containers prepared for a particular machine can be run on other machines that have the same CPU architecture. We utilize this feature to provide the same containers on both Turing (our older HPC) and Wahab (the new HPC cluster).
First, you will need to load the container_env package:
$ module load container_env
Then the module avail command will show additional modules that are containerized software, for example:
------------------------------- /cm/shared/mls/Containers/0.1 -----------------------------
R/4.0.0 (D) mamba/OS8-0.2 (D)
R/4.0.3 mapdamage2/2.2.1
allpaths-lg/52488 meme/5.1.1
anet2016/2018.11.18 meshalyzer/2.1
angsd/0.911 meshalyzer/2.2 (D)
ansys/19 modeltest-ng/0.1.6
ants/2.3.4 mono/6.12.0
artdeco/2020.05.08 mpich/3.3.2-no-slurm
base/FC32-0.1 mpich/3.3.2 (D)
base/OS8-0.1d multiqc/1.9
base/OS8-0.1 multiwfn/3.8
base/OS8-0.2 ncbi-cxx/22.0.0
base/OS8-0.3 nwchem/7.0.0
base/OS8-0.4 nwchem/7.0.2-stage1
base/0.1 nwchem/7.0.2 (D)
base/0.2 oceanM/2020.09.28
base/0.4 (D) opencarp/6.0-s1
bayestraits/3.0.2 opencarp/6.0 (D)
bcftools/1.10.2 openfoam/6-stage1
bismark/0.22.3 openfoam/6-stage2
blender/2.93 openfoam/6 (D)
...
Note that these are just a few of the containers we provide! The list of the available containers changes all the time, so please check the available software directly on Turing or Wahab.
Let us try the tensorflow-cpu container as a concrete example. (This is a container for Python 3 with many preinstalled packages for machine learning, including TensorFlow and Torch targeting CPU.)
$ module load tensorflow-cpu
How do you run software in this container? Add the crun prefix to the program name:
$ crun python
Python 3.7.7 (default, Sep 11 2020, 20:43:12)
[GCC 7.3.0] :: Intel Corporation on linux
Type "help", "copyright", "credits" or "license" for more information.
Intel(R) Distribution for Python is brought to you by Intel Corporation.
Please check out: https://software.intel.com/en-us/python-distribution
>>>
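You can also run a one-off command through the container without starting an interactive interpreter. A minimal sketch (the -c snippet is only an illustration; it assumes the container's preinstalled TensorFlow):

```shell
# Run a single Python command inside the container, then return
# to the host shell. The imported package is TensorFlow, which
# this container provides.
$ crun python -c 'import tensorflow as tf; print(tf.__version__)'
```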
Without the crun prefix, you will get a much older Python version on Wahab:
$ python
Python 2.7.17 (default, Feb 27 2021, 15:10:58)
[GCC 7.5.0] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>>
Since every container comes with its own crun command, how can we tell exactly which container we are running? To remove this ambiguity, we provide crun.MODULE_NAME, where MODULE_NAME stands for the name of the environment module for that particular container. For example, you can invoke
$ crun.tensorflow-cpu python
to run the python command inside the tensorflow-cpu container.
We strongly recommend that you use the disambiguated container launcher name (the crun.MODULE_NAME syntax) instead of crun in job scripts to avoid confusion, unless you know for certain that only one container module can be loaded. Since modules loaded in an interactive shell are inherited when launching a batch job, you may encounter surprises unless you use module purge followed by the requisite module load commands at the beginning of a job script.
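A batch job script following this advice might look like the sketch below (the Slurm directives and the script name run_model.py are illustrative assumptions; adapt them to your own job):

```shell
#!/bin/bash
#SBATCH --job-name=tf-example   # illustrative Slurm directives
#SBATCH --ntasks=1

# Start from a clean module environment so nothing inherited from
# the interactive shell that submitted this job interferes.
module purge
module load container_env
module load tensorflow-cpu

# Use the disambiguated launcher so the intended container runs
# even if another container module happens to be loaded.
crun.tensorflow-cpu python run_model.py   # run_model.py is hypothetical
```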
The crun.MODULE_NAME launcher is actually an executable Singularity container image that has its own root filesystem, containing its own shell, utility programs, libraries, etc. You can run a shell inside the container to explore its contents. For example,
$ crun.tensorflow-cpu /bin/bash
This bash shell runs inside the container. You can type ls / and compare the root filesystem inside the container with the one outside (i.e., the direct ls / command in a Wahab command-line shell).
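For a quick comparison without opening an interactive shell, you can run ls / both ways (the actual listings depend on the container and the host, so none are shown here):

```shell
# Root filesystem as seen from inside the tensorflow-cpu container
$ crun.tensorflow-cpu ls /

# Root filesystem of the host (Wahab), for comparison
$ ls /
```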
From the container, you have access to all your data located in /home/$USER and /RC/home/$USER, where $USER refers to your user ID in the HPC environment (usually the same as your MIDAS ID). At this point in time, /RC (long-term research storage) is not available. Please contact us if your workflow requires it.
You also have access to the system-wide shared folders that hold programs, i.e. /cm/shared (Turing & Wahab) and shared (Wahab only), so you can actually use some of the codes, shared data, etc. located there, provided they can run with the libraries provided inside the container (this is not always guaranteed, just as HPC programs from outside the containers are not guaranteed to work inside them).