In the HPC environment, you can use Python either through the command line or with Jupyter Notebook using OpenOnDemand.
For information on using OpenOnDemand + Jupyter please go here. To use Python through the command line please keep reading.
Wahab & Turing employ a "homebrewed", containerized Python system that wraps around virtual environments (virtualenv). This document demonstrates how it works and how you can take full advantage of this system.
¶ Doing Deep Learning in Python?
We have provided additional Python containers described in the "Using Python for Deep Learning" section below for deep learning applications. The same structure described here will also apply for those containers, so please read on to the end.
There is also "legacy" Python software (Python 2.7, 3.7) available through the lmod system, which will be documented separately.
module load container_env python3
There is also a python2 module available, for those who absolutely need to run Python 2.7:
module load container_env python2
We do not recommend that anyone use Python 2 anymore, since Python 2 is officially EOL. Only use the python2 module if you have legacy code that cannot be upgraded to run on Python 3.
To run Python from the container, put crun in front of the python command, for example:
crun python3
Without extra arguments, this launches the Python interpreter for interactive use. If you want to run a script (for example, script.py), invoke it in this way:
crun python3 script.py
or
crun python script.py
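As a first test of the commands above, a minimal script.py could simply report which interpreter is running it. This is only a sketch: the file name and the function describe_interpreter are illustrative choices, not something the containers provide.

```python
# script.py: a hypothetical first script to run with "crun python3 script.py"
import platform
import sys


def describe_interpreter():
    """Return a short description of the running Python interpreter."""
    return {
        "executable": sys.executable,
        "version": platform.python_version(),
        "implementation": platform.python_implementation(),
    }


if __name__ == "__main__":
    for key, value in describe_interpreter().items():
        print(f"{key}: {value}")
```

Running the same script with and without crun should make it visible whether the container's interpreter is the one being used.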
ODU's Python containers use Intel's Python distribution, which provides highly optimized software for Wahab's Intel Skylake hardware. This includes Intel's Math Kernel Library (MKL) for linear algebra and other mathematical operations, Intel MPI, and Intel Data Analytics Acceleration Library (DAAL).
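If you would like to see which numerical libraries NumPy was built against, the sketch below captures the output of numpy.show_config(). The helper name numpy_build_info is hypothetical, and searching the text for "mkl" is only a heuristic, since the exact wording of the build report varies between NumPy versions.

```python
import contextlib
import io


def numpy_build_info():
    """Return NumPy's build configuration as text, or None if NumPy is absent."""
    try:
        import numpy
    except ImportError:
        return None
    buffer = io.StringIO()
    # show_config() prints its report, so capture standard output.
    with contextlib.redirect_stdout(buffer):
        numpy.show_config()
    return buffer.getvalue()


if __name__ == "__main__":
    info = numpy_build_info()
    if info is None:
        print("NumPy is not importable in this environment")
    else:
        # Heuristic only: look for "mkl" anywhere in the build report.
        print("MKL mentioned in build info:", "mkl" in info.lower())
```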
The Python containers already come with many popular packages that are used in scientific computing, data analytics, machine learning, etc:
numpy, scipy
pandas
matplotlib
scikit-learn, xgboost, tensorflow (CPU-only)
h5py
ipython, jupyter
¶ How do I find Python packages available in the container?
Please issue the following command:
$ crun pip list
This will list all the available packages inside the container.
If you want the list of all the available packages both inside the container and in your project-specific environment (explained further below), invoke:
crun -p ~/envs/PROJECT_NAME pip list
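The same information can be collected from inside a Python script, which is handy when you want to log the package set alongside your results. The following is a minimal sketch that assumes Python 3.8 or newer (for importlib.metadata); the function name installed_packages is illustrative.

```python
from importlib import metadata  # standard library since Python 3.8


def installed_packages():
    """Return {distribution name: version} for every distribution on sys.path."""
    packages = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata lacks a name
            packages[name] = dist.metadata.get("Version")
    return packages


if __name__ == "__main__":
    listing = installed_packages()
    for name in sorted(listing, key=str.lower):
        print(f"{name}=={listing[name]}")
```

Saved under a name of your choosing, the script would be launched with crun in the same way as any other script, with or without the -p option.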
The containerized Python system allows users to install Python modules without involving admins. This is done through virtualenv.
Create a virtual environment
mkdir -p ~/envs
crun -c -p ~/envs/PROJECT_NAME
Please define a sensible PROJECT_NAME for the project you are working on. We recommend that you create a new environment for each new project that uses libraries unrelated to those of your other projects.
Install a package through pip (recommended):
crun -p ~/envs/PROJECT_NAME python -m pip install PACKAGE_NAME
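Many packages can be installed via pip; we recommend that you try this first before anything else. To find the correct package name, use pip search KEYWORD; if that command reports an error (PyPI has disabled the search interface it relies on), search on the PyPI website instead.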
You MUST run crun -c -p ~/envs/PROJECT_NAME first, before invoking pip install!
Install a downloaded package through setuptools (alternative #2):
crun -p ~/envs/PROJECT_NAME python setup.py install
Use this method if you download a Python package from the Internet as a tarball: extract the tarball, go into the directory of that package, then issue the command above.
Sometimes a package requires additional system libraries to install and work properly. In such a case, you may not be able to complete the install yourself; please contact rcc@odu.edu for additional help.
Run python with the additional env:
crun -p ~/envs/PROJECT_NAME python script.py
Your script.py will be able to find the packages that you installed in that environment.
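To confirm that a script really sees the additional environment, you can inspect the interpreter's prefix and module search path. The sketch below assumes that crun -p activates the environment in the usual virtualenv way; if the environment is attached by another mechanism, the printed sys.path entries are the more reliable indicator. The function name in_virtual_environment is illustrative.

```python
import sys


def in_virtual_environment():
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases make sys.prefix differ from sys.base_prefix.
    if hasattr(sys, "real_prefix"):
        return True
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("prefix:", sys.prefix)
    print("inside a virtual environment:", in_virtual_environment())
    print("module search path:")
    for entry in sys.path:
        print("  ", entry)
```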
¶ Example: Analysis of Thousands of Images
Suppose you are tasked with performing statistical analysis on thousands of images. You require the following packages: pillow, scikit-image. You also want seaborn because you want to produce beautiful visualizations. Let's call this project "VizStats". Here is a list of commands to create the environment and install the requisite libraries:
module load container_env python3
mkdir -p ~/envs
crun -c -p ~/envs/VizStats
crun -p ~/envs/VizStats pip install pillow scikit-image seaborn
Now you are ready to write your scripts and perform your calculations! Do not forget to use crun -p ~/envs/VizStats python to launch your Python script.
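Before starting a long analysis, it can help to verify that the three packages are importable from the VizStats environment. The following is a minimal sketch; the names VIZSTATS_MODULES and missing_packages are hypothetical. Note that two of the packages are imported under a different name than the one used with pip (pillow as PIL, scikit-image as skimage).

```python
import importlib.util

# Import names differ from the pip package names for two of the three packages.
VIZSTATS_MODULES = {
    "pillow": "PIL",
    "scikit-image": "skimage",
    "seaborn": "seaborn",
}


def missing_packages(modules=VIZSTATS_MODULES):
    """Return the pip package names whose import module cannot be found."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Not found:", ", ".join(missing))
    else:
        print("All VizStats packages are importable")
```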
¶ Advice for Installation
Please avoid installing packages (e.g. the crun -p ~/envs/... pip install command) from a job script that you may run over and over. This can create unpleasant surprises, such as packages being updated while the job is running and your results changing unexpectedly as a result of new behavior in the package. We recommend package installations be done outside your job scripts so that you know exactly what is going on.
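Instead of installing from a job script, a job can check at startup that the environment still holds the versions you expect. The sketch below assumes Python 3.8 or newer; the function name version_mismatches and the pinned version shown are hypothetical placeholders, so substitute the versions you actually installed.

```python
from importlib import metadata  # standard library since Python 3.8


def version_mismatches(expected):
    """Compare installed versions against {package: expected version}.

    Returns {package: installed version, or None if not installed}
    for every package that does not match.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            found = metadata.version(package)
        except metadata.PackageNotFoundError:
            found = None
        if found != wanted:
            mismatches[package] = found
    return mismatches


if __name__ == "__main__":
    # Hypothetical pin; replace it with the versions you actually installed.
    pins = {"seaborn": "0.11.2"}
    for package, found in version_mismatches(pins).items():
        print(f"{package}: expected {pins[package]}, found {found}")
```

A check like this only reads the environment, so it is safe to run at the top of every job.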
Please read the detailed information regarding the scheduler here; you will also find the information here very useful.
Get an allocation on a CPU node
salloc -c 4
Get an allocation on a GPU node
salloc -p gpu --gres gpu:1 -c 4
Job scripts use the same options as interactive jobs. You can use the following samples as a reference.
CPU
#!/bin/bash
#SBATCH -c 4
## ... add other SBATCH options here as needed
module load container_env python3
crun -p ~/envs/PROJECT_NAME python3 script.py
GPU
#!/bin/bash
#SBATCH -p gpu
#SBATCH --gres gpu:1
#SBATCH -c 4
module load container_env python3
crun -p ~/envs/PROJECT_NAME python script.py
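Inside script.py, it is usually worth matching the number of worker processes or threads to what the scheduler granted. The sketch below assumes that Slurm exports SLURM_CPUS_PER_TASK when the -c option is given and that CUDA_VISIBLE_DEVICES is set for GPU jobs; both depend on the site configuration, so the code falls back gracefully when they are absent. The function name allocated_cpus is illustrative.

```python
import os


def allocated_cpus(environ=os.environ):
    """Return the CPU count granted by Slurm, or fall back to os.cpu_count()."""
    value = environ.get("SLURM_CPUS_PER_TASK", "")
    if value.isdigit() and int(value) > 0:
        return int(value)
    # Not running under Slurm, or the variable was not exported.
    return os.cpu_count() or 1


if __name__ == "__main__":
    print("worker processes to use:", allocated_cpus())
    # CUDA_VISIBLE_DEVICES is commonly set for GPU jobs; it may be absent.
    print("visible GPUs:", os.environ.get("CUDA_VISIBLE_DEVICES", "(not set)"))
```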
The following containers are closely related to python3 in that they also contain essentially the same Python software, plus frameworks used in deep learning:
tensorflow-cpu/1.15.0
tensorflow-gpu/1.15.0
tensorflow-cpu/2.2.0
tensorflow-gpu/2.2.0
These containers also come with PyTorch version 1.3 (1.7 for newer containers), Jupyter, and ipython. Newer PyTorch versions can be found in specific containers such as torch-gpu/1.9.0.
Please visit the TensorFlow documentation page to learn more.
Please use these deep learning containers in lieu of python3 if you need to perform deep learning calculations! Although the current python3 container may have TensorFlow and/or Torch installed, we strongly recommend against using that container for deep learning purposes.