Energy use
This section covers how to monitor energy use for your jobs on ARCHER2 and how to control the CPU frequency which allows some control over how much energy is consumed by jobs.
Important
The default CPU frequency cap on ARCHER2 compute nodes for jobs launched using srun
is currently set
to 2.0 GHz. Information below describes how to control the CPU frequency cap using Slurm.
Monitoring energy use
The Slurm accounting database stores the total energy consumed by a job and you can also directly access the counters on compute nodes which capture instantaneous power and energy data broken down by different hardware components.
Using sacct to get energy usage for individual jobs
Energy usage for a particular job may be obtained using the sacct
command. For instance
sacct -j 2658300 --format=JobID,Elapsed,ReqCPUFreq,ConsumedEnergy
will provide the elapsed time and consumed energy in joules for the job(s) specified with -j
.
The output of this command is:
JobID Elapsed ReqCPUFreq ConsumedEnergy
------------ ---------- ---------- --------------
2658300 02:19:48 Unknown 4.58M
2658300.bat+ 02:19:48 0 4.58M
2658300.ext+ 02:19:48 0 4.58M
2658300.0 02:19:09 Unknown 4.57M
In this case we can see that the job consumed 4.58 MJ for a run lasting 2 hours, 19 minutes and 48 seconds with the CPU frequency unset. To convert the energy to kWh we can multiply the energy in joules by 2.78e-7, in this case resulting in 1.27 kWh.
The Slurm database may be cleaned without notice so you should gather any data you want as soon as possible after
the job completes - you can even add the sacct
command to the end of your job script to ensure this data is
captured.
In addition to energy statistics sacct
provides a number of other statistics that can be specified to the --format
option, the full list of which can be viewed with
sacct --helpformat
or using the man
pages.
Accessing the node energy/power counters
Note
The counters are available on each compute node and record data only for that compute node. If you are running multi-node jobs, you will need to combine data from multiple nodes to get data for the whole job.
On compute nodes, the raw energy counters and instantaneous power draw data are available at:
/sys/cray/pm_counters
There are a number of files in this directory, all the counter files include the current value and a timestamp.
- power - Point-in-time power (Watts).
- energy - Accumulated energy (Joules).
- cpu_power - Point-in-time power (Watts) used by the CPU domain.
- cpu_energy - The total energy (Joules) used by the CPU domain.
- cpu*_temp - Temperature reading (Celsius) of the CPU domain - one file per CPU socket.
- memory_power - Point-in-time power (Watts) used by the memory domain.
- memory_energy - The total energy (Joules) used by the memory domain.
- generation - A counter that increments each time a power cap value is changed.
- startup - Startup counter.
- freshness - Free-running counter that increments at a rate of approximately 10Hz.
- version - Version number for power management counter support.
- power_cap - Current power cap limit in Watts; 0 indicates no capping.
- raw_scan_hz - The power management scanning rate for all data in pm_counters.
This documentation is from the official HPE documentation:
Tip
The overall power
and energy
counters include all on-node systems. The major components
are the CPU (processor), memory and Slingshot network interface controller (NIC).
Note
There exists an MPI-based wrapper library that can gather the pm
counter values at runtime via a simple
set of function calls. See the link below for details.
Controlling CPU frequency
You can request specific CPU frequency caps (in kHz) for compute nodes through srun
options or environment variables.
The available frequency caps on the ARCHER2 processors along with the options and environment variables:
Frequency | srun option |
Slurm environment variable | Turbo boost enabled? |
---|---|---|---|
2.25 GHz | --cpu-freq=2250000 |
export SLURM_CPU_FREQ_REQ=2250000 |
Yes |
2.00 GHz | --cpu-freq=2000000 |
export SLURM_CPU_FREQ_REQ=2000000 |
No |
1.50 GHz | --cpu-freq=1500000 |
export SLURM_CPU_FREQ_REQ=1500000 |
No |
The only frequency caps available on the processors on ARCHER2 are 1.5 GHz, 2.0 GHz and 2.25GHz+turbo.
Important
Setting the CPU frequency cap in this way sets the maximum frequency that the processors can use. In practice, the individual cores may select different frequencies up to the value you have set depending on the workload on the processor.
Important
When you select the highest frequency value (2.25 GHz), you also enable turbo boost and so the processor is free to set the CPU frequency to values above 2.25 GHz if possible within the power and thermal limits of the processor. We see that, with turbo boost enabled, the processors typically boost to around 2.8 GHz even when performing compute-intensive work.
For example, you can add the following option to srun
commands in your job submission scripts to set the CPU frequency
to 2.25 GHz (and also enable turbo boost):
srun --cpu-freq=2250000 ...usual srun options and arguments...
Alternatively, you could add the following line to your job submission script before you use srun
to launch the application:
export SLURM_CPU_FREQ_REQ=2250000
Tip
Testing by the ARCHER2 CSE team has shown that most software are most energy efficient when 2.0 GHz is selected as the CPU frequency.
Important
The CPU frequency settings only affect applications launched using the srun
command.
Priority of frequency settings:
- The default
SLURM_CPU_FREQ_REQ
setting set by the ARCHER2 service applies if no other mechnism is used to set the CPU frequency - Setting the
SLURM_CPU_FREQ_REQ
environment variable in a job script overrides options provided the default environment variable setting for any subsequentsrun
commands in the job script. - Adding the
--cpu-freq=<freq in kHz>
option to thesrun
launch command itself overrides all other options.
Tip
Adding the --cpu-freq=<freq in kHz>
option to sbatch
(e.g. using #SBATCH --cpu-freq=<freq in kHz>
will not change the CPU frequency of srun
commands used in the job as the default setting for ARCHER2
will override the sbatch
option when the script runs.
Default CPU frequency
If you do not specify a CPU frequency then you will get the default setting for the ARCHER2 service
when you lanch an application using srun
.
The table below lists the history of default CPU frequency settings on the ARCHER2 service
Date range | Default CPU frequency |
---|---|
12 Dec 2022 - current date | 2.0 GHz |
Nov 2021 - 11 Dec 2022 | Unspecified - defaults to 2.25 GHz |
Slurm CPU frequency settings for centrally-installed software
Most centrally installed research software (available via module load
commands) uses the same default Slurm CPU frequency as set globally for all ARCHER2 users (see above
for this value). However, a small number of software have performance that is significantly
degraded by using lower frequency settings and so the modules for these packages reset the
CPU frequency to the highest value (2.25 GHz). The packages that currently do this are:
Important
If you specify the Slurm CPU frequency in your job scripts using one of the mechanisms described above after you have loaded the module, you will override the setting from the module.