Gaussian is a suite of computational chemistry programs used by chemists, chemical engineers, biochemists, physicists and other scientists. Gaussian provides various ab initio and semiempirical quantum chemistry methods as well as molecular mechanics to predict energies, molecular structures, spectroscopic data (NMR, IR, UV, etc.) and much more. It was first released in 1970 by John Pople and his research group at Carnegie Mellon University as Gaussian 70, and it has been continuously updated since then. Gaussian09 is the latest version available on ODU's Wahab and Turing clusters. It provides state-of-the-art capabilities for electronic structure modeling.
Gaussian09 is licensed to ODU and can only be used for non-commercial, academic research purposes by ODU community members.
Gaussian calculations are meant to run on the batch system. You can run Gaussian on either the Wahab or Turing cluster.
Please connect to the cluster by following the instructions in Getting Started. You will need access to the shell on the login node. Any connection method will work (SSH, the Wahab terminal on Open OnDemand, or a terminal under RDP).
Our team has prepared a script named g09slurm to launch Gaussian calculations on Turing and Wahab. Create the appropriate input file (and additional supporting files as necessary), then invoke:
jpratt@wahab-01:~/Desktop$ g09slurm INPUT_FILE.com OUTPUT_FILE.txt
This script submits your Gaussian job to SLURM to run on the main partitions; there is no need to use the sbatch command explicitly.
A Gaussian input file follows a specific syntax. If you are new to Gaussian, please read the specification of the input file on the Gaussian website.
¶ Filename convention
We recommend that you use the .com extension for the input file (com is short for commands) and either the .log or .out extension for the output file. For example, suppose you want to optimize the geometry of a benzene molecule using B3LYP. Your input file can then be benzene_opt_b3lyp.com and the corresponding output file benzene_opt_b3lyp.out.
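The naming convention above can be applied mechanically. The following is a minimal sketch, assuming a POSIX-style shell; the function name g09_output_name is hypothetical and is not part of g09slurm.

```shell
# Hypothetical helper: derive the output filename from a .com input filename.
g09_output_name() {
    # "${1%.com}" removes a trailing ".com"; ".out" is then appended.
    printf '%s\n' "${1%.com}.out"
}

# On the cluster this could be combined with the site's g09slurm script:
#   g09slurm benzene_opt_b3lyp.com "$(g09_output_name benzene_opt_b3lyp.com)"
```

Keeping the input and output names in step this way makes it easier to match each output to the input that produced it.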
Here is an example input file for geometry optimization of a water molecule using the RHF (restricted Hartree-Fock) method and the cc-pVDZ basis set:
%mem=250MB
%chk=H2O_opt_RHF_cc-pVDZ.chk
#RHF/cc-pVDZ 5D 7F units=AU opt

H2O RHF with cc-pVDZ basis (geometry optimization)

0,1
O 0.0000000 0.0000000 -0.0090000
H 0.0000000 1.5152630 -1.0588980
H 0.0000000 -1.5152630 -1.0588980
Let us name this input file H2O_opt_RHF_cc-pVDZ.com.
The first two lines above (those that begin with the % character) are called link 0 commands. Each link 0 line contains a key=value pair. The line that begins with # (the route section) indicates which computational method is requested, plus its specifications. There must be exactly one blank line after this line. The next line is the description of the calculation; write something that makes it easy for a reader to understand what you are doing. This must also be followed by exactly one blank line. The rest contains the molecule specification. Additional input lines are possible, depending on your specific computation.
A Gaussian input file follows very strict rules. Neither the order of the sections in the input file above nor the number of blank lines between them may be changed.
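Because the blank lines are easy to lose when copying and pasting, one option is to write the input file from a shell here-document. This is a minimal sketch: the function names write_h2o_input and check_blank_lines are hypothetical, and the blank line after the last atom is an assumption based on Gaussian's usual requirement that the molecule specification end with a blank line.

```shell
# Hypothetical helper: write the water example to the file given as $1.
write_h2o_input() {
    # The quoted 'EOF' keeps the shell from expanding anything in the body.
    cat > "$1" <<'EOF'
%mem=250MB
%chk=H2O_opt_RHF_cc-pVDZ.chk
#RHF/cc-pVDZ 5D 7F units=AU opt

H2O RHF with cc-pVDZ basis (geometry optimization)

0,1
O 0.0000000 0.0000000 -0.0090000
H 0.0000000 1.5152630 -1.0588980
H 0.0000000 -1.5152630 -1.0588980

EOF
}

# Hypothetical check: succeed only if lines 4 and 6 are blank.
# This assumes exactly two link 0 lines precede the route line, as above.
check_blank_lines() {
    awk 'NR == 4 || NR == 6 { if ($0 != "") bad = 1 } END { exit bad + 0 }' "$1"
}

# Example use:
#   write_h2o_input H2O_opt_RHF_cc-pVDZ.com && check_blank_lines H2O_opt_RHF_cc-pVDZ.com
```

The check is deliberately narrow: it only knows the layout of this particular example, so an input with a different number of link 0 lines would need different line numbers.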
Please try to run Gaussian with the input above. Once it has completed, inspect the output file (H2O_opt_RHF_cc-pVDZ.out) for the final result, near the end of the file:
Optimization completed.
-- Stationary point found.
----------------------------
! Optimized Parameters !
! (Angstroms and Degrees) !
-------------------------- --------------------------
! Name Definition Value Derivative Info. !
--------------------------------------------------------------------------------
! R1 R(1,2) 0.9463 -DE/DX = 0.0 !
! R2 R(1,3) 0.9463 -DE/DX = 0.0 !
! A1 A(2,1,3) 104.6141 -DE/DX = 0.0 !
--------------------------------------------------------------------------------
GradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGradGrad
Input orientation:
---------------------------------------------------------------------
Center Atomic Atomic Coordinates (Angstroms)
Number Number Type X Y Z
---------------------------------------------------------------------
1 8 0 0.000000 0.000000 0.010573
2 1 0 0.000000 0.748794 -0.568013
3 1 0 0.000000 -0.748794 -0.568013
---------------------------------------------------------------------
Examine how the atoms shifted as a result of the optimization. The initial H-O-H angle was 110.56 degrees; the optimized angle is smaller, at 104.61 degrees.
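The angles can be recomputed from the Cartesian coordinates as a cross-check. This is a minimal sketch using awk from the shell; the function name bond_angle is hypothetical. It uses atan2 of the cross-product norm and the dot product, which avoids needing an arccosine function.

```shell
# Hypothetical helper: angle (degrees) at a central atom between two neighbors.
# Arguments: x y z of the central atom, then x y z of each neighbor.
bond_angle() {
    awk -v cx="$1" -v cy="$2" -v cz="$3" \
        -v ax="$4" -v ay="$5" -v az="$6" \
        -v bx="$7" -v by="$8" -v bz="$9" 'BEGIN {
        # Vectors from the central atom to each neighbor.
        ux = ax - cx; uy = ay - cy; uz = az - cz
        vx = bx - cx; vy = by - cy; vz = bz - cz
        dot = ux * vx + uy * vy + uz * vz
        # Cross product components; the norm equals |u||v|sin(theta).
        px = uy * vz - uz * vy
        py = uz * vx - ux * vz
        pz = ux * vy - uy * vx
        cross = sqrt(px * px + py * py + pz * pz)
        pi = atan2(0, -1)
        printf "%.1f\n", atan2(cross, dot) * 180 / pi
    }'
}

# Starting geometry from the input file (the angle is unit-independent).
bond_angle 0 0 -0.009 0 1.515263 -1.058898 0 -1.515263 -1.058898
# Optimized geometry from the "Input orientation" table in the output.
bond_angle 0 0 0.010573 0 0.748794 -0.568013 0 -0.748794 -0.568013
```

Rounded to one decimal place, the two calls print 110.6 and 104.6, in line with the starting angle quoted above and the A(2,1,3) value in the optimized parameters table.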
You can use the UNIX command grep to examine how the total energy changed in the course of the optimization:
$ grep E.RHF H2O_opt_RHF_cc-pVDZ.out
SCF Done: E(RHF) = -76.0240385115 A.U. after 10 cycles
SCF Done: E(RHF) = -76.0269688270 A.U. after 9 cycles
SCF Done: E(RHF) = -76.0270530508 A.U. after 8 cycles
SCF Done: E(RHF) = -76.0270535127 A.U. after 7 cycles
The energy dropped by about 0.003 a.u. (about 3 millihartree) over the course of the optimization.
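The size of the energy change can be computed directly from the "SCF Done" lines instead of by eye. This is a minimal sketch using awk; the function name scf_energy_change is hypothetical, and it assumes the energy is the fifth whitespace-separated field, as in the lines shown above.

```shell
# Hypothetical helper: report the SCF energy change over an optimization.
# Reads a Gaussian output file ($1) and uses its "SCF Done" lines.
scf_energy_change() {
    awk '/SCF Done/ {
        energy = $5              # fifth field holds the energy in a.u.
        if (count++ == 0) first = energy
        last = energy
    }
    END {
        if (count == 0) { print "no SCF Done lines found"; exit 1 }
        printf "first=%.10f last=%.10f change=%.6f\n", first, last, last - first
    }' "$1"
}

# Example use:
#   scf_energy_change H2O_opt_RHF_cc-pVDZ.out
```

For the four energies listed above, the reported change is -0.003015 a.u.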
Gaussian calculations typically take a long time to finish, especially if you are dealing with either a very large molecule (say, with 100 or more atoms) or complicated methods (e.g. coupled cluster). Many computational methods in Gaussian can be run using multiple cores in shared-memory parallelism to speed up the computation. Please add a %nprocshared=N link 0 command (near the top of the file, with the other link 0 command lines) to specify the number of cores you want to use. On Wahab, choose N between 1 and 40 (inclusive), and on Turing, between 1 and 32. The g09slurm script will determine how many cores to request when submitting the job to SLURM.
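When the same input is to be run with several core counts, the %nprocshared line can be set by a small script instead of by hand. This is a minimal sketch; the function name set_nprocshared is hypothetical, and it relies on the assumption that link 0 commands may appear in any order ahead of the route line.

```shell
# Hypothetical helper: write a copy of an input file with %nprocshared=N.
# Arguments: N, input file, output file.
set_nprocshared() {
    local n="$1" input="$2" output="$3"
    {
        # New link 0 line goes first; any existing one is dropped below.
        printf '%%nprocshared=%s\n' "$n"
        grep -vi '^%nprocshared=' "$input"
    } > "$output"
}

# Example use:
#   set_nprocshared 16 H2O_opt_RHF_cc-pVDZ.com H2O_opt_RHF_cc-pVDZ_n16.com
```

Writing to a separate output file leaves the original input untouched, which is convenient when comparing timings for different values of N.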
The example above is deliberately trivial, meant only to show the essential parts of a Gaussian input file. For a real calculation, always consider the resources you need, i.e. the amount of memory (%mem) and the number of CPU cores (%nprocshared).
When using multiple cores, please test with one or two calculations to find a good value of N for your particular molecular system. Adding more cores may not always be beneficial! As an extreme, hypothetical example: a calculation may run for 50 minutes with 16 cores, but 45 minutes with 32 cores. Doubling the core count beyond 16 saves only 5 minutes, so the gain is strongly diminished. Remember that parallel computing must always be used with care. Refer to the Concepts and Terminology section of LLNL's Parallel Computing tutorial to get some familiarity with parallel computing.
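The hypothetical timings above can be turned into a speedup and a parallel efficiency figure. This is a minimal sketch using awk; the function name scaling_gain is hypothetical, and efficiency is defined here as the measured speedup divided by the ideal speedup (the ratio of core counts).

```shell
# Hypothetical helper: compare two timings of the same calculation.
# Arguments: cores_a minutes_a cores_b minutes_b (run b uses more cores).
scaling_gain() {
    awk -v na="$1" -v ta="$2" -v nb="$3" -v tb="$4" 'BEGIN {
        speedup = ta / tb            # how much faster run b finished
        ideal = nb / na              # speedup if scaling were perfect
        printf "speedup=%.2f ideal=%.2f efficiency=%.0f%%\n", speedup, ideal, 100 * speedup / ideal
    }'
}

# 50 minutes on 16 cores versus 45 minutes on 32 cores.
scaling_gain 16 50 32 45
```

For these numbers the call prints speedup=1.11 ideal=2.00 efficiency=56%, meaning the extra 16 cores are mostly idle or spent on overhead.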
Gaussian has a distributed parallel version using "Linda". ODU does not have the license for this type of parallelism. If your molecules are extremely large and the calculation becomes too long (say, more than 3 days), consider using another quantum chemistry package, if possible. ODU also has GAMESS and NWChem as possible alternatives to Gaussian.
Gaussian uses a few temporary files while doing its calculation. Most notable are the read-write files (with the .rwf extension), two-electron integrals (.int), two-electron integral derivatives (.d2e), and checkpoint files (.chk). Of these files, the one most useful for restarting a calculation is the checkpoint file.
Sometimes you may want to save the checkpoint file so that you can reuse it elsewhere. For example, the saved orbitals can be a good starting point for a slightly different geometry or molecular configuration. The checkpoint file is also useful for restarting a calculation that was interrupted by an error or by the job running out of time. In these cases, please add the %chk=FILENAME.chk link 0 command to make it easy for you to retrieve the checkpoint.
Gaussian uses these scratch files quite intensively, especially the read-write and integral files. For the best performance, they should reside on fast filesystems, which means the scratch partition (/scratch/$USER on Wahab and /scratch-lustre/$USER on Turing, where $USER refers to your MIDAS ID). Additionally, these files can get very big and may fill up your home directory.
Here are several strategies you can use:
Alternative 1. Run the entire Gaussian calculation in the scratch partition, and copy the useful files back to your home partition (the input file, *.com; the output file, *.log or *.out; and potentially the checkpoint file, *.chk). Please clean up the files you no longer need.
Alternative 2. Specify an explicit location on your link 0 lines. As an example, create a subdirectory called G09-scratch under your scratch directory:
mkdir /scratch/$USER/G09-scratch
Then in your calculation, add the following link 0 commands at the top of the input file:
%rwf=/scratch/$USER/G09-scratch/
%nosave
! optional: save the checkpoint
%chk=MOLECULE.chk
(Replace $USER with your MIDAS ID, and do not omit the trailing slash above.) The %nosave command will erase the RWF file after the calculation finishes successfully. If you have multiple molecules to calculate at the same time, it may be wise to create different scratch directories for different molecules.
(NOTE: If %chk is specified after %nosave, as in the example above, the checkpoint data will not be erased. See http://gaussian.com/running/?tabid=2 for more details.)
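For Alternative 2 with several molecules, the per-molecule scratch directory and its link 0 lines can be generated together. This is a minimal sketch; the function name make_scratch_link0 and the per-molecule subdirectory layout are hypothetical choices, not something g09slurm requires.

```shell
# Hypothetical helper: create a per-molecule RWF directory and print the
# matching link 0 lines. $1 is your scratch directory, $2 a molecule label.
make_scratch_link0() {
    local dir="$1/G09-scratch/$2"
    mkdir -p "$dir" || return 1
    printf '%%rwf=%s/\n' "$dir"      # trailing slash kept on the RWF path
    printf '%%nosave\n'
    printf '%%chk=%s.chk\n' "$2"     # listed after %nosave, so it is kept
}

# Example on Wahab (commented out; needs the cluster's scratch filesystem):
#   make_scratch_link0 "/scratch/$USER" benzene
```

Because the shell expands $USER before the lines are printed, the output contains the literal path and can be pasted at the top of the input file as is.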
In case a calculation fails, our g09slurm script will leave behind a scratch directory named G09-<INPUTFILE>.o<SLURM_JOB_ID>.tmp, where <INPUTFILE> is the name of your Gaussian input file and <SLURM_JOB_ID> is the corresponding SLURM job ID, for example, 113309. Please inspect the contents of this subdirectory and, if applicable, restart from the checkpoint file (by default, Gau-NNNN.chk in this subdirectory, unless you override it with the %chk=FILENAME link 0 command).
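Leftover scratch directories from failed jobs can be located with find. This is a minimal sketch, assuming the directory naming pattern described above and a find implementation that supports -maxdepth (GNU and BSD find both do); the function name list_leftover_checkpoints is hypothetical.

```shell
# Hypothetical helper: list checkpoint files left in g09slurm scratch
# directories (G09-<INPUTFILE>.o<SLURM_JOB_ID>.tmp) directly under $1.
list_leftover_checkpoints() {
    find "${1:-.}" -maxdepth 2 -type f -name '*.chk' \
        -path '*/G09-*.o*.tmp/*' | sort
}

# Example use, from the directory where the job was submitted:
#   list_leftover_checkpoints .
```

The function only lists files; deciding which checkpoints to keep and which scratch directories to delete is left to you.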
If a calculation was terminated prematurely, whether due to hardware or software errors or due to the job scheduler's time limit, it can be resumed from either the checkpoint file or the "RWF" (disk read-write) file. Starting from the checkpoint is easier and more likely to succeed, whereas restarting from the RWF file can return to the last point in the calculation more readily but works only under certain conditions.
Using Gaussian Checkpoint Files -- this article explains how one can use a checkpoint file to restart various types of calculations.
Restarting using Gaussian restart Keyword -- Gaussian's official documentation provides an example of restarting a calculation using the RWF file.
Advice: If you want to continue an interrupted calculation, we suggest that you copy the files (*.com, *.log, and *.chk or *.rwf) to a separate folder and restart the calculation from that folder. This way, you preserve the files from the previous calculation. Your final result will be the combination of the original calculation and the resumed calculation.
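The copy step in this advice can be scripted so that no file is forgotten. This is a minimal sketch; the function name prepare_restart is hypothetical, and it simply copies whichever of the listed file types exist for a given base name.

```shell
# Hypothetical helper: copy the files of an interrupted calculation into a
# separate folder. $1 is the path without extension, $2 the destination.
prepare_restart() {
    local base="$1" dest="$2" ext
    mkdir -p "$dest" || return 1
    for ext in com log out chk rwf; do
        # Copy only the files that actually exist; -p keeps timestamps.
        if [ -f "$base.$ext" ]; then
            cp -p "$base.$ext" "$dest/"
        fi
    done
}

# Example use:
#   prepare_restart H2O_opt_RHF_cc-pVDZ restart-1
```

If an RWF file is among the files copied, choosing a destination on the scratch partition avoids filling the home directory.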
Be advised that the RWF file can be extremely big. Please put it in the scratch partition and delete it as soon as the computation is completed.
As of 2020, we have Gaussian 09 rev D1. The most current version of Gaussian documented on the vendor's website is Gaussian 16. Most parts of the manual are still applicable to Gaussian 09, but do exercise caution and test whether the features work as expected. In particular, the "new methods" or "new capabilities" advertised for Gaussian 16 are not available in Gaussian 09. One stark difference is that Gaussian 16 supports some calculations on GPUs, whereas our Gaussian 09 does not support GPUs at all.