Using Python
Python is supported on ARCHER2 both for running intensive parallel jobs and as an analysis tool. This section describes how to use Python in either of these scenarios.
The Python installations on ARCHER2 contain some of the most commonly used packages. If you wish to install additional Python packages, we recommend that you use the `pip` command; see the section entitled Installing your own Python packages (with pip).
Important
Python 2 is not supported on ARCHER2 as it reached end-of-life at the start of 2020.
Note
When you log onto ARCHER2, no Python module is loaded by default. You will generally need to load the `cray-python` module to access the functionality described below.
HPE Cray Python distribution
The recommended way to use Python on ARCHER2 is to use the HPE Cray Python distribution.
The HPE Cray distribution provides Python 3 along with some of the most common packages used for scientific computation and data analysis. These include:
- `numpy` and `scipy` - built using GCC against HPE Cray LibSci
- `mpi4py` - built using GCC against HPE Cray MPICH
- `dask`
The HPE Cray Python distribution can be loaded (either on the front-end or in a submission script) using:
module load cray-python
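Once the module is loaded, you can quickly confirm that the packages listed above are importable. A minimal check (the version numbers printed will depend on the `cray-python` version loaded):

python -c 'import numpy, scipy, dask, mpi4py; print("numpy", numpy.__version__, "scipy", scipy.__version__)'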
Tip
The HPE Cray Python distribution is built using GCC compilers. If you wish to compile your own Python, C/C++ or Fortran code to use with HPE Cray Python, you should ensure that you compile using `PrgEnv-gnu` to make sure they are compatible.
Installing your own Python packages (with pip)
Sometimes, you may need to set up a local custom Python environment such that it extends a centrally-installed `cray-python` module. By extend, we mean being able to install packages locally that are not provided by `cray-python`. This is necessary because some Python packages, such as `mpi4py`, must be built specifically for the ARCHER2 system and so are best provided centrally.
You can do this by creating a lightweight virtual environment where the local packages can be installed. This environment is created on top of an existing Python installation, known as the environment's base Python.
First, load the `PrgEnv-gnu` environment.
auser@ln01:~> module load PrgEnv-gnu
This first step is necessary because subsequent `pip` installs may involve source code compilation, and it is better that this be done using the GCC compilers to maintain consistency with how some base Python packages have been built.
Second, select the base Python by loading the `cray-python` module that you wish to extend.
auser@ln01:~> module load cray-python
Next, create the virtual environment within a designated folder.
python -m venv --system-site-packages /work/t01/t01/auser/myvenv
In our example, the environment is created within a `myvenv` folder located on `/work`, which means the environment will be accessible from the compute nodes. The `--system-site-packages` option ensures this environment is based on the currently loaded `cray-python` module. See https://docs.python.org/3/library/venv.html for more details.
You're now ready to activate your environment.
source /work/t01/t01/auser/myvenv/bin/activate
Tip
The `myvenv` path uses a fictitious project code, `t01`, and username, `auser`. Please remember to replace those values with your actual project code and username. Alternatively, you could enter `${HOME/home/work}` in place of `/work/t01/t01/auser`. That fragment expands `${HOME}` and then replaces the `home` part with `work`.
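For example, assuming your home directory follows the usual ARCHER2 layout of /home/<project>/<project>/<username>:

auser@ln01:~> echo ${HOME/home/work}
/work/t01/t01/auser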
Installing packages to your local environment can now be done as follows.
(myvenv) auser@ln01:~> python -m pip install <package name>
Running `pip` directly, as in `pip install <package name>`, will also work, but we show the `python -m` approach as this is consistent with the way the virtual environment was created. Further, if the package installation requires code compilation, you should amend the command to ensure use of the ARCHER2 compiler wrappers.
(myvenv) auser@ln01:~> CC=cc CXX=CC FC=ftn python -m pip install <package name>
And when you have finished installing packages, you can deactivate the environment by running the `deactivate` command.
(myvenv) auser@ln01:~> deactivate
auser@ln01:~>
The packages you have installed will only be available once the local environment has been activated. So, when running code that requires these packages, you must first activate the environment by adding the activation command to the submission script, as shown below.
#!/bin/bash --login
#SBATCH --job-name=myvenv
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=64
#SBATCH --cpus-per-task=2
#SBATCH --time=00:10:00
# Replace [budget code] below with your budget code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=standard
#SBATCH --qos=standard
source /work/t01/t01/auser/myvenv/bin/activate
export SRUN_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK}
srun --distribution=block:block --hint=nomultithread python myvenv-script.py
Tip
If you find that a module you've installed to a virtual environment on `/work` isn't found when running a job, it may be that it was previously installed to the default location of `$HOME/.local`, which is not mounted on the compute nodes. This can be an issue as `pip` will reuse any modules found at this default location rather than reinstall them into a virtual environment. Thus, even if the virtual environment is on `/work`, a module you've asked for may actually be located on `/home`.
You can check a module's install location and its dependencies with `pip show`, for example `pip show matplotlib`. You may then run `pip uninstall matplotlib` while no virtual environment is active to uninstall it from `$HOME/.local`, and then re-run `pip install matplotlib` while your virtual environment on `/work` is active to reinstall it there. You will need to do this for any modules installed on `/home` that you will use either directly or indirectly. Remember you can check all your installed modules with `pip list`.
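That check-and-reinstall cycle might look as follows, using `matplotlib` as in the tip above (a sketch; adjust the paths to your own environment):

auser@ln01:~> pip show matplotlib            # inspect the Location field
auser@ln01:~> pip uninstall matplotlib       # run with no virtual environment active
auser@ln01:~> source /work/t01/t01/auser/myvenv/bin/activate
(myvenv) auser@ln01:~> pip install matplotlib
(myvenv) auser@ln01:~> pip show matplotlib   # Location should now be under /work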
Extending ML modules with your own packages via pip
The environment being extended does not have to come from one of the centrally-installed `cray-python` modules. You can also create a local virtual environment based on one of the Machine Learning (ML) modules, e.g., `tensorflow` or `pytorch`. One extra command is required; it is issued immediately after the `python -m venv ...` command.
extend-venv-activate /work/t01/t01/auser/myvenv
The `extend-venv-activate` command merely adds some extra commands to the virtual environment's `activate` script, ensuring that the Python packages will be gathered from the local virtual environment, the ML module and from the `cray-python` base module. All this means you would avoid having to install ML packages within your local area.
Note
The `extend-venv-activate` command becomes available (i.e., its location is placed on the path) only when the ML module is loaded. The ML modules are themselves based on `cray-python`. For example, `tensorflow/2.12.0` is based on the `cray-python/3.9.13.1` module.
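Putting these pieces together, a minimal sketch of the full sequence, using `pytorch` as the example ML module:

module load pytorch   # also places extend-venv-activate on the path
python -m venv --system-site-packages /work/t01/t01/auser/myvenv
extend-venv-activate /work/t01/t01/auser/myvenv
source /work/t01/t01/auser/myvenv/bin/activate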
Conda on ARCHER2
Conda-based Python distributions (e.g. Anaconda, Mamba, Miniconda) are an extremely popular way of installing and accessing software on many systems, including ARCHER2. Although conda-based distributions can be used on ARCHER2, care is needed in how they are installed and configured so that the installation does not adversely affect your use of ARCHER2. In particular, you should be careful of:
- Where you install conda on ARCHER2
- Conda additions to shell configuration files such as `.bashrc`
We cover each of these points in more detail below.
Conda install location
If you only need to use the files and executables from your conda installation on the login and data analysis nodes (via the `serial` QoS) then the best place to install conda is in your home directory structure - this will usually be the default install location provided by the installation script.
If you need to access the files and executables from conda on the compute nodes then you will need to install to a different location, as the home file systems are not available on the compute nodes. The work file systems are not well suited to hosting Python software natively due to the way in which file access works, particularly during Python startup. There are two main options for using conda from ARCHER2 compute nodes:
- Use a conda container image
- Install conda on the solid state storage
Use a conda container image
You can pull official conda-based container images from Dockerhub that you can use if you want just the standard set of Python modules that come with the distribution. For example, to get the latest Anaconda distribution as a Singularity container image on the ARCHER2 work file system, you would use (on an ARCHER2 login node, from the directory on the work file system where you want to store the container image):
singularity build anaconda3.sif docker://continuumio/anaconda3
Once you have the container image, you can run scripts in it with a command like:
singularity exec -B $PWD anaconda3.sif python my_script.py
As the container image is a single large file, you end up doing a single large read from the work file system rather than lots of small reads of individual Python files. This improves the performance of Python and reduces the detrimental impact on the wider file system performance for all users.
We have pre-built a Singularity container with the Anaconda distribution on ARCHER2. Users can access it at `$EPCC_SINGULARITY_DIR/anaconda3.sif`. To run a Python script with the centrally-installed image, you can use:
singularity exec -B $PWD $EPCC_SINGULARITY_DIR/anaconda3.sif python my_script.py
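For batch work, a minimal sketch of a serial job script using the centrally-installed image (the partition and QoS settings follow the serial Python example later in this section; my_script.py is a placeholder for your own code):

#!/bin/bash --login
#SBATCH --job-name=anaconda_test
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
#SBATCH --account=[budget code]
#SBATCH --partition=serial
#SBATCH --qos=serial

singularity exec -B $PWD $EPCC_SINGULARITY_DIR/anaconda3.sif python my_script.py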
If you want additional packages that are not available in the standard container images then you will need to build your own container images. If you need help to do this, then please contact the ARCHER2 Service Desk.
Conda additions to shell configuration files
During the install process most conda-based distributions will ask a question like:
Do you wish the installer to initialize Miniconda3 by running conda init?
If you are installing to the ARCHER2 work directories or the solid state storage, you should answer "no" to this question.
Adding the initialisation to shell startup scripts (typically `.bashrc`) means that every time you log in to ARCHER2, the conda environment will try to initialise by reading lots of files within the conda installation. This approach was designed for the case where a user has installed conda on their personal device and so is the only user of the file system. For shared file systems such as those on ARCHER2, this places a large load on the file system and will lead to you seeing slow login times and slow response from your command line on ARCHER2. It will also lead to degraded read/write performance from the work file systems for you and other users, so should be avoided at all costs.
If you have previously installed a conda distribution and answered "yes" to the question about adding the initialisation to shell configuration files, you should edit your `~/.bashrc` file to remove the conda initialisation entries. This means deleting the lines that look something like:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/work/t01/t01/auser/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/work/t01/t01/auser/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/work/t01/t01/auser/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/work/t01/t01/auser/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda initialize <<<
Running Python
Example serial Python submission script
#!/bin/bash --login
#SBATCH --job-name=python_test
#SBATCH --ntasks=1
#SBATCH --time=00:10:00
# Replace [budget code] below with your project code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=serial
#SBATCH --qos=serial
# Load the Python module, ...
module load cray-python
# ..., or, if using local virtual environment
source <<path to virtual environment>>/bin/activate
# Run your Python program
python python_test.py
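Here python_test.py stands in for your own code; a trivial, purely illustrative example would be:

# python_test.py: minimal placeholder script
import numpy as np
print("Sum of 0..9 is", np.arange(10).sum())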
Example mpi4py job submission script
Programs that have been parallelised with mpi4py can be run on the ARCHER2 compute nodes. Unlike the serial Python submission script, however, we must launch the Python interpreter using `srun`. Failing to do so will result in Python running a single MPI rank only.
#!/bin/bash --login
# Slurm job options (job-name, compute nodes, job time)
#SBATCH --job-name=mpi4py_test
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=128
#SBATCH --cpus-per-task=1
#SBATCH --time=0:10:0
# Replace [budget code] below with your budget code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=standard
#SBATCH --qos=standard
# Load the Python module, ...
module load cray-python
# ..., or, if using local virtual environment
source <<path to virtual environment>>/bin/activate
# Pass cpus-per-task setting to srun
export SRUN_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK}
# Run your Python program
# Note that srun MUST be used to wrap the call to python,
# otherwise your code will run serially
srun --distribution=block:block --hint=nomultithread python mpi4py_test.py
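For completeness, a minimal (illustrative) mpi4py_test.py that matches the job script above:

# mpi4py_test.py: minimal MPI hello world
from mpi4py import MPI

comm = MPI.COMM_WORLD
print(f"Hello from rank {comm.Get_rank()} of {comm.Get_size()}")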
Tip
If you have installed your own packages you will need to activate your local Python environment within your job submission script as shown at the end of Installing your own Python packages (with pip).
By default, mpi4py will use the Cray MPICH OFI library. If you wish to use UCX instead, you must first, within the submission script, load `PrgEnv-gnu` before loading the UCX modules, as shown below.
module load PrgEnv-gnu
module load craype-network-ucx
module load cray-mpich-ucx
module load cray-python
Running Python at scale
The file system metadata server may become overloaded when running a parallel Python script over many fully populated nodes (i.e., 128 MPI ranks per node). Performance degrades due to the IO operations that accompany a high volume of Python import statements. Typically, each import will first require the module or library to be located by searching a number of file paths before the module is loaded into memory. Such a workload scales as Np × Nlib × Npath, where Np is the number of parallel processes, Nlib the number of libraries imported, and Npath the number of file paths searched. In this way, much time can be lost during the initial phase of a large Python job, not to mention the fact that the IO contention will be impacting other users of the system.
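For a sense of scale, a job on 256 fully populated nodes runs 256 × 128 = 32768 ranks; if each rank imports 100 modules and each import searches 10 paths, start-up alone generates of the order of 32768 × 100 × 10 ≈ 3 × 10^7 metadata operations (the import and path counts here are purely illustrative).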
Spindle is a tool for improving the library-loading performance of dynamically linked HPC applications. It provides a mechanism for scalable loading of shared libraries, executables and Python files from a shared file system at scale without turning the file system into a bottleneck. This is achieved by caching libraries or their locations within node memory. Spindle takes a pure user-space approach: users do not need to configure new file systems, load particular OS kernels or build special system components. The tool operates on existing binaries — no application modification or special build flags are required.
The script below shows how to run Spindle with your Python code.
#!/bin/bash --login
#SBATCH --nodes=256
#SBATCH --ntasks-per-node=128
...
module load cray-python
module load spindle/0.13
export SRUN_CPUS_PER_TASK=${SLURM_CPUS_PER_TASK}
spindle --slurm --python-prefix=/opt/cray/pe/python/${CRAY_PYTHON_LEVEL} \
srun --overlap --distribution=block:block --hint=nomultithread \
python mpi4py_script.py
The `--python-prefix` argument can be set to a list of colon-separated paths if necessary. In the example above, the `CRAY_PYTHON_LEVEL` environment variable is set as a consequence of loading `cray-python`.
Note
The `srun --overlap` option is required for Spindle as the version of Slurm on ARCHER2 is newer than 20.11.
Using JupyterLab on ARCHER2
It is possible to view and run Jupyter notebooks from both login nodes and compute nodes on ARCHER2.
Note
You can test such notebooks on the login nodes, but please do not attempt to run any computationally intensive work. Jobs may get killed once they hit a CPU limit on login nodes.
Please follow these steps.
1. Install JupyterLab in your work directory.

    module load cray-python
    export PYTHONUSERBASE=/work/t01/t01/auser/.local
    export PATH=$PYTHONUSERBASE/bin:$PATH
    # source <<path to virtual environment>>/bin/activate
    # If using a virtual environment, uncomment the line above and
    # remove the --user flag from the pip command below
    pip install --user jupyterlab
2. If you want to test JupyterLab on the login node, please go straight to step 3. To run your Jupyter notebook on a compute node, you first need to run an interactive session.

    srun --nodes=1 --exclusive --time=00:20:00 --account=<your_budget> \
        --partition=standard --qos=short --reservation=shortqos \
        --pty /bin/bash

    Your prompt will change to something like the below.

    auser@nid001015:/tmp>

    In this case, the node id is `nid001015`. Now execute the following on the compute node.

    cd /work/t01/t01/auser   # Update the path to your work directory
    export PYTHONUSERBASE=$(pwd)/.local
    export PATH=$PYTHONUSERBASE/bin:$PATH
    export HOME=$(pwd)
    module load cray-python
    # source <<path to virtual environment>>/bin/activate
    # If using a virtual environment, uncomment the line above
3. Run the JupyterLab server.

    export JUPYTER_RUNTIME_DIR=$(pwd)
    jupyter lab --ip=0.0.0.0 --no-browser

    Once it's started, you will see a URL printed in the terminal window of the form `http://127.0.0.1:<port_number>/lab?token=<string>`; we'll need this URL for step 6.

4. Please skip this step if you are connecting from a machine running Windows. Open a new terminal window on your laptop and run the following command.

    ssh <username>@login.archer2.ac.uk -L<port_number>:<node_id>:<port_number>

    where `<username>` is your username, and `<node_id>` is the id of the node you're currently on (for a login node, this will be `ln01`, or similar; on a compute node, it will be a mix of numbers and letters). In our example, `<node_id>` is `nid001015`. Note, please use the same port number as that shown in the URL of step 3. This number may vary; likely values are 8888 or 8889.
Please skip this step if you are connecting from Linux or macOS. If you are connecting from Windows, you should use MobaXterm to configure an SSH tunnel as follows.
- Click on the
Tunnelling
button above the MobaXterm terminal. Create a new tunnel by clicking onNew SSH tunnel
in the window that opens. - In the new window that opens, make sure the
Local port forwarding
radio button is selected. - In the
forwarded port
text box on the left underMy computer with MobaXterm
, enter the port number indicated in the JupyterLab server output (e.g., 8888 or 8890). - In the three text boxes on the bottom right under
SSH server
enterlogin.archer2.ac.uk
, your ARCHER2 username and then22
. - At the top right, under
Remote server
, enter the id of the login or compute node running the JupyterLab server and the associated port number. - Click on the
Save
button. - In the tunnelling window, you will now see a new row for the settings you just entered. If you like, you can give a name to the tunnel in the leftmost column to identify it.
- Click on the small key icon close to the right for the new connection to tell MobaXterm which SSH private key to use when connecting to ARCHER2. You should tell it to use the same
.ppk
private key that you normally use when connecting to ARCHER2. - The tunnel should now be configured. Click on the small start button (like a play '>' icon) for the new tunnel to open it. You'll be asked to enter your ARCHER2 account password -- please do so.
- Click on the
6. Now, if you open a browser window locally, you should be able to navigate to the URL from step 3, and this should display the JupyterLab server. If JupyterLab is running on a compute node, the notebook will be available for the length of the interactive session you have requested.
Warning
Please do not use the other http address given by the JupyterLab output, the one formatted `http://<node_id>:<port_number>/lab?token=<string>`. Your local browser will not recognise the `<node_id>` part of the address.
Using Dask Job-Queue on ARCHER2
The Dask-jobqueue project makes it easy to deploy Dask on ARCHER2. You can find more information in the Dask Job-Queue documentation.
Please follow these steps:
- Install Dask-Jobqueue
module load cray-python
export PYTHONUSERBASE=/work/t01/t01/auser/.local
export PATH=$PYTHONUSERBASE/bin:$PATH
pip install --user dask-jobqueue --upgrade
- Using Dask
Dask-jobqueue creates a Dask Scheduler in the Python process where the cluster object is instantiated. A script for running Dask jobs on ARCHER2 might look something like this:
from dask_jobqueue import SLURMCluster
cluster = SLURMCluster(cores=128,
processes=16,
memory='256GB',
queue='standard',
header_skip=['--mem'],
job_extra=['--qos="standard"'],
python='srun python',
project='z19',
walltime="01:00:00",
shebang="#!/bin/bash --login",
local_directory='$PWD',
interface='hsn0',
env_extra=['module load cray-python',
'export PYTHONUSERBASE=/work/t01/t01/auser/.local/',
'export PATH=$PYTHONUSERBASE/bin:$PATH',
'export PYTHONPATH=$PYTHONUSERBASE/lib/python3.8/site-packages:$PYTHONPATH'])
cluster.scale(jobs=2) # Deploy two single-node jobs
from dask.distributed import Client
client = Client(cluster) # Connect this local process to remote workers
# wait for jobs to arrive, depending on the queue, this may take some time
import dask.array as da
x = … # Dask commands now use these distributed resources
This script can be run on the login nodes; it submits the Dask jobs to the job queue. Users should ensure that the computationally intensive work is done with the Dask commands, which run on the compute nodes.
The cluster object parameters specify the characteristics for running on a single compute node. The `header_skip` option is required as we are running on exclusive nodes, where you should not specify the memory requirements; Dask, however, requires you to supply this option.
Jobs are deployed with the `cluster.scale` command, where the `jobs` option sets the number of single-node jobs requested. Job scripts are generated (from the cluster object) and these are submitted to the queue to begin running once the resources are available. You can check the status of the jobs by running `squeue -u $USER` in a separate terminal.
If you wish to see the generated job script you can use:
print(cluster.job_script())