MITgcm
The Massachusetts Institute of Technology General Circulation Model (MITgcm) is a numerical model designed for study of the atmosphere, ocean, and climate. MITgcm's flexible non-hydrostatic formulation enables it to simulate fluid phenomena over a wide range of scales; its adjoint capabilities enable it to be applied to sensitivity questions and to parameter and state estimation problems. By employing fluid equation isomorphisms, a single dynamical kernel can be used to simulate flow of both the atmosphere and ocean.
Useful Links
MITgcm home page: https://mitgcm.org
MITgcm documentation: https://mitgcm.readthedocs.io
Building MITgcm on ARCHER2
MITgcm is not available as a centrally installed module on ARCHER2, since users build their own executables specific to the problem they are working on.
You can obtain the MITgcm source code from the developers by cloning from the GitHub repository with the command
git clone https://github.com/MITgcm/MITgcm.git
You should then copy the ARCHER2 optfile into the MITgcm directories.
Warning
A current ARCHER2 optfile is not available at the present time. Please contact support@archer2.ac.uk for help.
You should also set the following environment variables. MITGCM_ROOTDIR is used to locate the source code and should point to the top-level MITgcm directory. Optionally, adding the MITgcm tools directory to your PATH environment variable makes it easier to use tools such as genmake2, and the MITGCM_OPT environment variable makes it easier to pass the optfile to genmake2.
export MITGCM_ROOTDIR=/path/to/MITgcm
export PATH=$MITGCM_ROOTDIR/tools:$PATH
export MITGCM_OPT=$MITGCM_ROOTDIR/tools/build_options/dev_linux_amd64_cray_archer2
When using genmake2 to create the Makefile, you will need to specify the optfile to use. Other commonly used options are -mods to include extra source code, -mpi to enable MPI, and -omp to enable OpenMP. You might then run a command that resembles the following:
genmake2 -mods /path/to/additional/source -mpi -optfile $MITGCM_OPT
You can read about the full set of options available to genmake2 by running:
genmake2 -help
Finally, build your executable by running:
make depend
make
Running MITgcm on ARCHER2
Pure MPI
Once you have built your executable you can write a script like the following which will allow it to run on the ARCHER2 compute nodes. This example would run a pure MPI MITgcm simulation over 2 nodes of 128 cores each for up to one hour.
#!/bin/bash
# Slurm job options (job-name, compute nodes, job time)
#SBATCH --job-name=MITgcm-simulation
#SBATCH --time=1:0:0
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
# Replace [budget code] below with your project code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=standard
#SBATCH --qos=standard
# Set the number of threads to 1
# This prevents any threaded system libraries from automatically
# using threading.
export OMP_NUM_THREADS=1
# Ensure the cpus-per-task option is propagated to srun commands
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
# Launch the parallel job
# Using 256 MPI processes and 128 MPI processes per node
# srun picks up the distribution from the sbatch options
srun --distribution=block:block --hint=nomultithread ./mitgcmuv
Hybrid OpenMP & MPI
Warning
Running the model in hybrid mode may lead to performance decreases as well as increases. You should profile your code both as a pure MPI application and as a hybrid OpenMP-MPI application to ensure you are making efficient use of resources. Be sure to read both the ARCHER2 advice on OpenMP and the MITgcm documentation first.
Note
Early versions of the ARCHER2 MITgcm optfile do not contain an OMPFLAG. Please ensure you have an up-to-date copy of the optfile before attempting to compile OpenMP-enabled code.
Depending upon your model setup, you may wish to run MITgcm as a hybrid OpenMP-MPI application. Compiling the model for this is as simple as adding the -omp flag when calling genmake2 and updating your SIZE.h file to have multiple tiles per process, as sketched below.
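As a hedged illustration only: a SIZE.h fragment consistent with the 16-processes-per-node, 4-threads-per-process example below might use two tiles per process in each direction (nSx*nSy = 4), so that each OpenMP thread can work on its own tile. All grid sizes here are hypothetical and must be chosen to match your own domain.
C     Illustrative SIZE.h fragment (hypothetical values):
C     nSx*nSy = 4 tiles per process, one per OpenMP thread.
      PARAMETER (
     &           sNx =  32,
     &           sNy =  32,
     &           OLx =   4,
     &           OLy =   4,
     &           nSx =   2,
     &           nSy =   2,
     &           nPx =  16,
     &           nPy =   2,
     &           Nx  = sNx*nSx*nPx,
     &           Ny  = sNy*nSy*nPy,
     &           Nr  =  50)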
The model can be run using a Slurm job submission script similar to the one shown below. This example runs MITgcm across 2 nodes, with 16 MPI processes per node and 4 OpenMP threads per process. Note that this underpopulates the nodes: only 128 of the 256 available cores are used. Underpopulation can itself sometimes improve performance.
#!/bin/bash
# Slurm job options (job-name, compute nodes, job time)
#SBATCH --job-name=MITgcm-hybrid-simulation
#SBATCH --time=1:0:0
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=16
#SBATCH --cpus-per-task=4
# Replace [budget code] below with your project code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=standard
#SBATCH --qos=standard
# Set the number of OpenMP threads per MPI process
export OMP_NUM_THREADS=4 # Set to number of threads per process
export OMP_PLACES="cores(128)" # Set to total number of threads
export OMP_PROC_BIND=true # Required if we want to underpopulate nodes
# Ensure the cpus-per-task option is propagated to srun commands
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
# Launch the parallel job
# Using 32 MPI processes in total, 16 MPI processes per node
# srun picks up the distribution from the sbatch options
srun --distribution=block:block --hint=nomultithread ./mitgcmuv
One final note: remember to update the eedata file in the model's run directory so that the number of threads requested there matches the number requested in the job submission script.
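For example, to match the hypothetical 2x2-tiles-per-process SIZE.h sketch above and OMP_NUM_THREADS=4, the EEPARMS namelist in eedata would request 2 threads in each direction (a minimal sketch):
 &EEPARMS
 nTx=2,
 nTy=2,
 &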
Reproducing the ECCO version 4 (release 4) state estimate on ARCHER2
The ECCO version 4 state estimate (ECCOv4-r4) is an observationally-constrained numerical solution produced by the ECCO group at JPL. If you would like to reproduce the state estimate on ARCHER2 in order to create customised runs and experiments, follow the instructions below. They have been slightly modified from the JPL instructions for ARCHER2.
For more information, see the ECCOv4-r4 website https://ecco-group.org/products-ECCO-V4r4.htm
Get the ECCOv4-r4 source code
First, navigate to your directory on the /work filesystem so that your files are accessible from the compute nodes. Next, create a working directory, perhaps MYECCO, and navigate into it:
mkdir MYECCO
cd MYECCO
In order to reproduce ECCOv4-r4, we need a specific checkpoint of the MITgcm source code.
git clone https://github.com/MITgcm/MITgcm.git -b checkpoint66g
Next, get the ECCOv4-r4 specific code from GitHub:
cd MITgcm
mkdir -p ECCOV4/release4
cd ECCOV4/release4
git clone https://github.com/ECCO-GROUP/ECCO-v4-Configurations.git
mv ECCO-v4-Configurations/ECCOv4\ Release\ 4/code .
rm -rf ECCO-v4-Configurations
Get the ECCOv4-r4 forcing files
The surface forcing and other input files that are too large to be stored on GitHub are available via NASA data servers. In total, these files are about 200 GB in size. You must register for an Earthdata account and connect to a WebDAV server in order to access these files. For more detailed instructions, read the help page https://ecco.jpl.nasa.gov/drive/help.
First, apply for an Earthdata account: https://urs.earthdata.nasa.gov/users/new
Next, acquire your WebDAV credentials: https://ecco.jpl.nasa.gov/drive (second box from the top)
Now, you can use wget to download the required forcing and input files:
wget -r --no-parent --user YOURUSERNAME --ask-password https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/input_forcing
wget -r --no-parent --user YOURUSERNAME --ask-password https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/input_init
wget -r --no-parent --user YOURUSERNAME --ask-password https://ecco.jpl.nasa.gov/drive/files/Version4/Release4/input_ecco
After using wget, you will notice that the input* directories are, by default, several levels deep in the directory structure. Use the mv command to move the input* directories to the directory where you executed the wget command. Specifically:
mv ecco.jpl.nasa.gov/drive/files/Version4/Release4/input_forcing/ .
mv ecco.jpl.nasa.gov/drive/files/Version4/Release4/input_init/ .
mv ecco.jpl.nasa.gov/drive/files/Version4/Release4/input_ecco/ .
rm -rf ecco.jpl.nasa.gov
Compiling and running ECCOv4-r4
The steps for building the ECCOv4-r4 instance of MITgcm are very similar to those for other build cases. First, you will need to create a build directory:
cd MITgcm/ECCOV4/release4
mkdir build
cd build
Load the NetCDF modules:
module load cray-hdf5
module load cray-netcdf
If you haven't already, set your environment variables:
export MITGCM_ROOTDIR=../../../../MITgcm
export PATH=$MITGCM_ROOTDIR/tools:$PATH
export MITGCM_OPT=$MITGCM_ROOTDIR/tools/build_options/dev_linux_amd64_cray_archer2
Next, compile the executable:
genmake2 -mods ../code -mpi -optfile $MITGCM_OPT
make depend
make
Once you have compiled the model, you will have the mitgcmuv executable for ECCOv4-r4.
Create run directory and link files
In order to run the model, you need to create a run directory and link/copy the appropriate files. First, navigate to your directory on the /work filesystem. From the MITgcm/ECCOV4/release4 directory:
mkdir run
cd run
# link the data files
ln -s ../input_init/NAMELIST/* .
ln -s ../input_init/error_weight/ctrl_weight/* .
ln -s ../input_init/error_weight/data_error/* .
ln -s ../input_init/* .
ln -s ../input_init/tools/* .
ln -s ../input_ecco/*/* .
ln -s ../input_forcing/eccov4r4* .
python mkdir_subdir_diags.py
# manually copy the mitgcmuv executable
cp -p ../build/mitgcmuv .
For a short test run, edit the nTimeSteps variable in the file data. Comment out the default value and uncomment the line reading nTimeSteps=8, as in the sketch below. This is a useful test to make sure that the model can at least start up.
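The relevant lines in data (in the PARM03 namelist) should end up looking something like the following sketch; the commented-out line stands for whatever production value the configuration ships with, and other PARM03 entries are unchanged:
 &PARM03
# nTimeSteps=<production value>,
 nTimeSteps=8,
 &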
To run on ARCHER2, submit a batch script to the Slurm scheduler. Here is an example submission script:
#!/bin/bash
# Slurm job options (job-name, compute nodes, job time)
#SBATCH --job-name=ECCOv4r4-test
#SBATCH --time=1:0:0
#SBATCH --nodes=8
#SBATCH --ntasks-per-node=12
#SBATCH --cpus-per-task=1
# Replace [budget code] below with your project code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=standard
#SBATCH --qos=standard
# Request a higher CPU frequency (the default is lower, which noticeably slows adjoint runs)
#SBATCH --cpu-freq=2250000
# Set the number of threads to 1
# This prevents any threaded system libraries from automatically
# using threading.
export OMP_NUM_THREADS=1
# Ensure the cpus-per-task option is propagated to srun commands
export SRUN_CPUS_PER_TASK=$SLURM_CPUS_PER_TASK
# Launch the parallel job
# Using 256 MPI processes and 128 MPI processes per node
# srun picks up the distribution from the sbatch options
srun --distribution=block:block --hint=nomultithread ./mitgcmuv
This configuration uses 96 MPI processes at 12 MPI processes per node. Once the run has finished, check the end of one of the standard output files to confirm that it completed successfully:
tail STDOUT.0000
It should read
PROGRAM MAIN: Execution ended Normally
The files named STDOUT.* contain diagnostic information that you can use to check your results. As a first pass, check the printed statistics for any clear signs of trouble (e.g. NaN values or extremely large values).
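A quick, rough first check, assuming the STDOUT.* naming used above, is to search all of the standard output files for NaNs:
grep -il nan STDOUT.*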
ECCOv4-r4 in adjoint mode
If you have access to the commercial TAF software produced by FastOpt (http://FastOpt.de), then you can compile and run the ECCOv4-r4 instance of MITgcm in adjoint mode. This mode is useful for comprehensive sensitivity studies and for constructing state estimates. From the MITgcm/ECCOV4/release4 directory, create a new code directory and a new build directory:
mkdir code_ad
cd code_ad
ln -s ../code/* .
cd ..
mkdir build_ad
cd build_ad
In this instance, the code_ad and code directories are identical, although this does not have to be the case. Make sure that you have the staf script in your path or in the build_ad directory itself. To make sure that you have the most up-to-date script, run:
./staf -get staf
To test your connection to the FastOpt servers, try:
./staf -test
You should receive the following message:
Your access to the TAF server is enabled.
The compilation commands are similar to those used to build the forward case.
# load relevant modules
module load cray-netcdf-hdf5parallel
module load cray-hdf5-parallel
# compile adjoint model
../../../tools/genmake2 -ieee -mpi -mods=../code_ad -of=$MITGCM_OPT
make depend
make adtaf
make adall
The source code will be packaged and forwarded to the FastOpt servers, where it will undergo source-to-source translation via the TAF algorithmic differentiation software. If the compilation is successful, you will have an executable named mitgcmuv_ad, which runs the ECCOv4-r4 configuration of MITgcm in adjoint mode. As before, create a run directory and copy in the relevant files. The procedure is the same as for the forward model, with the following modifications:
cd ..
mkdir run_ad
cd run_ad
# manually copy the mitgcmuv executable
cp -p ../build_ad/mitgcmuv_ad .
To run the model, change the name of the executable in the Slurm submission script, as shown below; everything else should be the same as in the forward case. As above, at the end of the run you should have a set of STDOUT.* files that you can examine for any obvious problems.
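For example, the launch line from the forward-model script becomes:
srun --distribution=block:block --hint=nomultithread ./mitgcmuv_ad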
Compile-time errors
If TAF compilation fails with an error like failed to convert GOTPCREL relocation; relink with --no-relax, then add -Wl,--no-relax to the FFLAGS options in your optfile.
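In practice this means editing the optfile so that the flag is appended to its FFLAGS definition. Optfiles are shell fragments, so the change is something like the following sketch:
FFLAGS="$FFLAGS -Wl,--no-relax"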
Checkpointing for adjoint runs
In an adjoint run, there is a balance between storage (i.e. saving the model state to disk) and recomputation (i.e. integrating the model forward from a stored state). You control this balance by changing the nchklev parameters in the tamc.h file at compile time.
A suggested strategy that has been used on a variety of HPC platforms is as follows (a worked sketch follows the list):
1. Set nchklev_1 as large as possible, up to the size allowed by memory on your machine. Use the size command on the executable to estimate the memory per process; this should be just a little less than the maximum available per core, which on ARCHER2 is 2 GB on standard nodes and 4 GB on high-memory nodes.
2. Next, set nchklev_2 and nchklev_3 to be large enough to accommodate the entire run. A common strategy is to set nchklev_2 = nchklev_3 = sqrt(numsteps/nchklev_1) + 1.
3. If the nchklev_2 files get too big, you may have to add a fourth level (nchklev_4), but this is unlikely.
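As a worked sketch with made-up numbers: for a hypothetical run of 26280 time steps with nchklev_1 = 30 chosen to fit in memory, sqrt(26280/30) + 1 is roughly 30.6, so rounding up gives nchklev_2 = nchklev_3 = 31. The three levels then accommodate 30 x 31 x 31 = 28830 steps, which covers the run. In tamc.h this would read:
C     Hypothetical checkpointing levels for a 26280-step run
      INTEGER    nchklev_1
      PARAMETER( nchklev_1 =  30 )
      INTEGER    nchklev_2
      PARAMETER( nchklev_2 =  31 )
      INTEGER    nchklev_3
      PARAMETER( nchklev_3 =  31 )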
This strategy allows you to keep as much in memory as possible, minimising the I/O requirements for the disk. This is useful, as I/O is often the bottleneck for MITgcm runs on HPC.
Another way to tune performance is to adjust how tapelevel I/O is handled. The following settings perform well for most configurations:
C o tape settings
#define ALLOW_AUTODIFF_WHTAPEIO
#define AUTODIFF_USE_OLDSTORE_2D
#define AUTODIFF_USE_OLDSTORE_3D
#define EXCLUDE_WHIO_GLOBUFF_2D
#define ALLOW_INIT_WHTAPEIO