
Waterfield is Old Dominion University’s latest production high-performance computing (HPC) cluster on Google Cloud, managed by the Research & Cloud Computing (RCC) group within Digital Transformation & Technology (DTT).
To request access to the Waterfield HPC cluster, please complete the RCC access request form:
https://forms.odu.edu/view.php?id=93440
You will receive an email once your request is approved, and your account will be provisioned automatically.
Note: This access request form only needs to be completed once to activate HPC services at ODU. You do not need to submit a separate request for each cluster (e.g., Wahab, Turing, Waterfield). If you already have HPC access, you do not need to fill out this form again.
Waterfield is accessed via standard SSH over the public internet.
Login hostname: waterfield.hpc.odu.edu.
From your local machine, run:
ssh <your_midas_username>@waterfield.hpc.odu.edu
You will authenticate using your ODU MIDAS credentials and complete Duo two-factor authentication (2FA) when prompted. After login, you will be connected to the Waterfield login node where you can browse files, load software modules, submit SLURM jobs, and monitor running jobs.
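As a convenience, you can define a host alias in your local ~/.ssh/config so you do not need to retype the full hostname and username each time (a sketch; the alias name and username below are placeholders):

```
Host waterfield
    HostName waterfield.hpc.odu.edu
    User your_midas_username
```

With this in place, `ssh waterfield` is equivalent to the full command above.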
For more general information on getting started, see: https://wiki.hpc.odu.edu/en/GettingStarted
Open OnDemand is a web portal that provides browser-based access to HPC systems. It allows users to view, edit, upload, and download files; create, edit, submit, and monitor jobs; and run interactive GUI applications such as Jupyter Notebook and remote desktops.
Access Open OnDemand for Waterfield at the following address using your MIDAS credentials:
https://ondemand.waterfield.hpc.odu.edu
For more information on Open OnDemand, see: https://wiki.hpc.odu.edu/en/open-ondemand.
Each user is provided with a home directory and scratch space on Waterfield.
Your home directory is located at:
/home/<your_midas_username>
Home directories are hosted on a Google Filestore filesystem and are intended for persistent user data such as scripts, source code, configuration files, and smaller datasets.
In addition to your home directory, Waterfield provides high-performance scratch space:
/scratch/<your_midas_username>
Scratch space is hosted on a Lustre filesystem and is intended for temporary, high-throughput data generated by running jobs.
A shared filesystem is also available:
/cm/shared
This directory is copied from the on-prem clusters and provides shared software and data for use across systems.
Note: Storage policies for Waterfield are still evolving. At this time, data in /scratch is not backed up, and automatic purge policies are under discussion. Users should treat /scratch as temporary job space and are responsible for managing and cleaning up their own data.
Waterfield enforces per-user storage quotas.
If additional storage is required for a research project, please email rcc@odu.edu with your request and justification. Quotas may be expanded on a case-by-case basis.
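To see how much space you are using (and whether you are approaching a quota), standard POSIX tools work from the login node; the /scratch path below follows the layout described above:

```shell
# Summarize usage in each storage area (run on the login node)
du -sh "$HOME"                               # total size of your home directory
du -sh "/scratch/$USER" 2>/dev/null || true  # scratch usage (path per the docs above)
df -h "$HOME"                                # capacity and free space of the home filesystem
```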
Data can be transferred to and from Waterfield using standard tools such as scp or rsync.
Example: Copy a file from Wahab to your home directory on Waterfield
From the Waterfield login node:
scp <midas_username>@wahab.hpc.odu.edu:~/wahab_file.txt .
To copy directories, use:
scp -r <midas_username>@wahab.hpc.odu.edu:~/mydir .
For large or repeated transfers, rsync is recommended:
rsync -av <midas_username>@wahab.hpc.odu.edu:~/mydir .
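For large transfers, it is worth verifying integrity after the copy. Comparing checksums on both ends is a simple approach (sketched here with a local stand-in file in place of the remote copy step):

```shell
# Create a stand-in for a transferred file
echo "example data" > wahab_file.txt
cp wahab_file.txt transferred_copy.txt   # stands in for the scp/rsync step

# Compare checksums on source and destination; identical hashes mean a clean copy
src=$(sha256sum wahab_file.txt | cut -d' ' -f1)
dst=$(sha256sum transferred_copy.txt | cut -d' ' -f1)
[ "$src" = "$dst" ] && echo "transfer verified"
```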
We are also evaluating Globus as a long-term solution for large-scale data transfers.
Software on Waterfield is accessed using the Lmod environment module system, similar to Wahab and Turing.
View available software: module avail
View loaded modules: module list
Load a module: module load python3
On Waterfield, application execution is performed using crun, and its use is required. Each application module is backed by a container image, and crun acts as a wrapper that launches the appropriate container.
Example: module load python3 && crun python3 --version
If multiple application modules are loaded, you may also use application-specific wrappers such as crun.alphafold or crun.pytorch-gpu.
This model ensures that software runs in consistent, isolated environments and avoids conflicts between system libraries.
For Python-based workflows, users are encouraged to use Python virtual environments rather than requesting system-wide package installations. Guidance for setting up Python virtual environments on ODU systems is available here: https://wiki.hpc.odu.edu/en/Software/Python
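A minimal sketch of creating and using a virtual environment (the environment name and path are illustrative; on Waterfield you would first run module load python3):

```shell
# Create an isolated environment in your home directory
python3 -m venv "$HOME/envs/myproject"

# Activate it; python and pip now refer to the environment
source "$HOME/envs/myproject/bin/activate"
python -m pip --version                   # pip now points inside the venv

# python -m pip install numpy             # then install packages as needed
```

Packages installed this way live entirely under your home directory and do not require system-wide installation.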
If you require software that is not currently available through modules or containers, please email rcc@odu.edu with your request.
Waterfield uses SLURM for job scheduling, similar to Wahab and Turing. Most SLURM commands and submission scripts will work the same, with only minor adjustments required. Partition names differ between systems.
To view available partitions, run:
sinfo
| Partition Pattern | Type | Resources Provided | Typical Use Case |
|---|---|---|---|
| cpu-2 | Standard CPU | 2 CPU cores | Small jobs, testing, lightweight scripts (default) |
| cpu-32 | Standard CPU | 32 CPU cores | Most CPU workloads |
| cpuflex-192 | Flex CPU | 192 CPU cores | Very large or memory-intensive CPU jobs |
| rtxp6000flex-* | Flex GPU | 1, 2, 4, or 8 RTX Pro 6000 GPUs | GPU workloads needing moderate acceleration |
| h100flex-* | Flex GPU | 1, 2, 4, or 8 H100 GPUs | Large-scale deep learning |
| h200flex-8 | Flex GPU | 8 H200 GPUs | Extreme-scale training / HPC workloads |
| b200flex-8 | Flex GPU | 8 B200 GPUs | Extreme-scale training / HPC workloads |
Partition names follow this general pattern:
<type><flex?>-<count>
Examples:
- cpu-32 → 32 CPU cores on a standard node
- cpuflex-192 → 192 CPU cores on a large on-demand node
- h100flex-4 → 4 H100 GPUs

The numeric suffix indicates the number of compute devices.
Note: Always request the smallest partition that meets your needs.
General guidance:
- Use the standard CPU partitions (cpu-2, cpu-32) for most workloads.
- Flex partitions map to much larger and more expensive cloud instances; over-requesting resources increases queue times and cloud costs.

Example CPU job script:
#!/bin/bash
#SBATCH -J python-cpu
#SBATCH -p cpu-2
module load python3
crun python3 --version
Example GPU job script:
#!/bin/bash
#SBATCH -J gpu-test
#SBATCH -p h100flex-1
#SBATCH --gres=gpu:1
module load python3
crun python3 gpu.py
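Job scripts like the ones above are saved to a file and submitted with sbatch. A sketch of the full cycle (the sbatch/squeue/scancel steps are shown as comments since they require the cluster scheduler):

```shell
# Save the job script to a file (contents mirror the CPU example above)
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH -J python-cpu
#SBATCH -p cpu-2
module load python3
crun python3 --version
EOF

# On the Waterfield login node you would then run:
#   sbatch myjob.sh      # submit; SLURM prints "Submitted batch job <jobid>"
#   squeue -u $USER      # monitor your queued and running jobs
#   scancel <jobid>      # cancel a job if needed
```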
Important: All Waterfield compute resources are centrally funded and incur real cloud costs. Users are expected to request only the resources they actually need. Requesting significantly more resources than required (for example, requesting an 8-GPU node while only using 1 GPU) is considered misuse of the system. Resource usage is monitored by RCC, and abuse may result in loss of access to Waterfield.
You are now ready to begin submitting jobs on the Waterfield HPC cluster.
Q: Is there a cost associated with using the Waterfield HPC cluster?
A: Yes. All jobs and storage on Waterfield incur real cloud costs. At this time, Waterfield is centrally funded by DTT. However, usage is actively monitored. Users are expected to request only the resources they need, as excessive or inefficient usage can significantly increase costs.
Q: Is there a storage quota?
A: Waterfield has standard per-user storage allocation targets of 1 TB for scratch space. These allocations are managed administratively to ensure fair use of shared resources. If additional storage is required for a research project, please email rcc@odu.edu with your request.
Q: Is data in /scratch persistent?
A: No. /scratch is intended for temporary, high-performance job data. It is not backed up, and purge policies are under discussion. Users should treat /scratch as disposable space and store important data in their home directory.
Q: How is Waterfield different from Wahab or Turing?
A: Waterfield is a cloud-based HPC system where compute nodes are provisioned dynamically on demand. Storage systems are separate from on-prem clusters. The overall workflow (SLURM jobs, modules, crun) is similar, but resources map directly to cloud infrastructure.
Q: How do I request new software?
A: Email rcc@odu.edu with your request. RCC maintains a curated software environment and may install commonly used tools system-wide when appropriate. For most custom environments, users are encouraged to use containers or Python virtual environments.
Q: How do I transfer data in and out of Waterfield?
A: You can use standard tools such as scp or rsync from the login node. For large-scale data transfers, RCC is evaluating Globus as a long-term solution.
Q: Do GPU jobs cost more than CPU jobs?
A: Yes. GPU nodes are significantly more expensive than CPU nodes. Submitting a GPU job provisions a GPU-enabled cloud instance and immediately incurs higher costs. Users should only request GPUs when their application explicitly requires them.
Q: Do I need to use crun to run software?
A: Yes. On Waterfield, all application execution is performed using crun, which launches container-based environments for each application module. This ensures consistent and reproducible software behavior across compute nodes.
Q: Can I use interactive environments like Jupyter or OnDemand?
A: Yes. An Open OnDemand portal is currently being deployed at https://ondemand.waterfield.hpc.odu.edu, which will provide browser-based access to interactive sessions, including Jupyter and desktop environments. Supported features will be announced as the service enters production.
Q: What happens if I misuse resources or over-request compute?
A: Waterfield usage is actively monitored by RCC. Requesting significantly more resources than required (for example, using large GPU nodes for minimal workloads) is considered misuse and may result in restricted or revoked access.
Q: Who should I contact if I run into issues?
A: Please email rcc@odu.edu with a description of the issue, relevant error messages, and any job scripts if applicable.