Quickstart for users
This guide aims to quickly enable new users to get up and running on ARCHER2. It covers the process of getting an ARCHER2 account, logging in and running your first job.
Request an account on ARCHER2
You need to use both a password and a passphrase-protected SSH key pair to log into ARCHER2. You get the password from SAFE, but you will also need to set up your own SSH key pair and add the public part to your account via SAFE before you can log in. We cover the authentication steps below.
Obtain an account on the SAFE website
The first step is to sign up for an account on the ARCHER2 SAFE website. The SAFE account is used to manage all of your login accounts, allowing you to report on your usage and quotas. To do this:
- Go to the SAFE New User Signup Form
- Fill in your personal details. You can come back later and change them if you wish
- Click Submit
You are now registered. Your SAFE password will be emailed to the email address you provided. You can then log in with that email address and password. (You can change your initial SAFE password whenever you want by selecting the Change SAFE password option from the Your details menu.)
Request an ARCHER2 login account
Once you have a SAFE account and an SSH key you will need to request a user account on ARCHER2 itself. To do this you will require a Project Code; you usually obtain this from the Principal Investigator (PI) or project manager for the project you will be working on. Once you have the Project Code:
- Log into SAFE
- Use the Login accounts - Request new account menu item
- Select the correct project from the drop down list
- Select the archer2-4c machine in the list of available machines
- Click Next
- Enter a username for the account and (optionally) an SSH public key
- If you do not specify an SSH key at this stage, your default key will be used (if you have one). For users who had an ARCHER account, the default key will be your ARCHER SSH key.
- You can always add an SSH key (or additional SSH keys) using the process described below.
- Click Request
The PI or project manager of the project will be asked to approve your request. Once it has been approved, the account will be created and you will receive an email. You can then come back to SAFE and pick up the initial single-use password for your new account.
ARCHER2 account passwords are also sometimes referred to as LDAP passwords by the system.
Generating and adding an SSH key pair
How you generate your SSH key pair depends on which operating system you use and which SSH client you use to connect to ARCHER2. We will not cover the details on generating an SSH key pair here, but detailed information on this topic is available in the ARCHER2 User and Best Practice Guide.
After generating your SSH key pair, add the public part to your login account using SAFE:
- Log into SAFE
- Use the menu Login accounts and select the ARCHER2 account to be associated with the SSH key
- On the subsequent Login account details page, click the Add Credential button
- Select SSH public key as the Credential Type and click Next
- Either copy and paste the public part of your SSH key into the SSH Public key box or use the button to select the public key file on your computer
- Click Add to associate the public SSH key part with your account
Once you have done this, your SSH key will be added to your ARCHER2 account.
Remember, you will need to use both an SSH key and password to log into ARCHER2 so you will also need to collect your initial password before you can log into ARCHER2 for the first time. We cover this next.
If you want to connect to ARCHER2 from more than one machine, e.g. from your home laptop as well as your work laptop, you should generate an SSH key on each machine and add each of the public keys into SAFE.
Collecting your ARCHER2 password
You should now collect your ARCHER2 password:
- Log into SAFE
- Use the Login accounts menu to select your new login account
- Use the View Login Account Password button to view your single-use ARCHER2 password
This password is generated randomly by the software. It's best to copy-and-paste it across when you log in to ARCHER2. After you log in, you will immediately be prompted to begin the process of changing your password. You should now enter the initial password again, and then you will be prompted for your new, easy-to-remember password. Your new password should conform to the ARCHER2 Password Policy.
The View Login Account Password option within SAFE will continue to display your old initial password. Your SAFE account has no knowledge of your new machine account password.
Login to ARCHER2
To log into ARCHER2 you should use the ssh command with your ARCHER2 username and the ARCHER2 login address.
You will first be prompted for your machine account password. Once you have entered your password successfully, you will then be prompted for the passphrase associated with your SSH key pair. You need to enter both credentials correctly to be able to access ARCHER2.
If your SSH key pair is not stored in the default location (usually ~/.ssh/id_rsa) on your local system, you may need to specify the path to the private part of the key with the -i option to ssh. For example, if your key is in a file called keys/id_rsa_archer2 you would use the command
ssh -i keys/id_rsa_archer2 email@example.com
to log in.
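As an alternative to typing -i each time, the key path can be recorded in your SSH client configuration. The following is a minimal sketch rather than an official ARCHER2 recipe: the function name add_archer2_host and the alias archer2 are hypothetical, and the login address, username and key path are arguments you must supply yourself.

```shell
#!/bin/bash
# Append a host entry for ARCHER2 to an SSH client configuration file.
# Every value is supplied by the caller; nothing here is taken from SAFE.
#   $1  config file to append to (normally ~/.ssh/config)
#   $2  ARCHER2 login address
#   $3  your ARCHER2 username
#   $4  path to the private part of your key
add_archer2_host() {
    local config_file=$1 login_address=$2 username=$3 key_file=$4
    # "archer2" is a hypothetical alias chosen for this sketch.
    cat >> "$config_file" <<EOF
Host archer2
    HostName $login_address
    User $username
    IdentityFile $key_file
    IdentitiesOnly yes
EOF
}

# Example call (placeholders; substitute your own values):
#   add_archer2_host ~/.ssh/config <login address> auser ~/keys/id_rsa_archer2
```

With such an entry in place, ssh archer2 should pick up the key without the -i option. You will still be asked for your password and key passphrase.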
When you first log into ARCHER2, you will be prompted to change your initial password. This is a three step process:
- When prompted to enter your ldap password: re-enter the password you retrieved from SAFE
- When prompted to enter your new password: type in a new password
- When prompted to re-enter the new password: re-enter the new password
Your password has now been changed.
More information on connecting to ARCHER2 is available in the Connecting to ARCHER2 section of the User Guide.
File systems and manipulating data
ARCHER2 has a number of different file systems and understanding the difference between them is crucial to being able to use the system. In particular, transferring and moving data often requires a bit of thought in advance to ensure that the data is secure and in a useful form.
ARCHER2 file systems are:
- /home: an NFS file system available on login and service nodes. It is backed up for disaster recovery purposes only; recovery of accidentally deleted data is not supported.
- /work: a Lustre file system available on login, service and compute nodes. It is not backed up.
Top tips for managing data on ARCHER2:
- Do not generate huge numbers of files (>1000) in a single directory.
- Poor performance relating to file transfer is often due to the number of files involved in the transfer - minimise the number of files that you have to transfer by using archiving tools to improve performance.
- Archive directories or large numbers of files before moving them between file systems (e.g. by using commands like tar).
- When using rsync between file systems mounted on ARCHER2, avoid the use of compression options as these can slow performance (the time saved by transferring smaller compressed files is usually less than the overhead added by having to compress files on the fly).
- Think about automating the merging and transfer of multiple files output by software on ARCHER2 to other resources. The Data Management Guide linked below provides examples of how to automatically verify the integrity of an archive.
Information on best practice in managing your data is available in the Data management and transfer section of the User Guide.
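The archiving and integrity-checking tips above can be sketched with tar and sha256sum. This is an illustration only and does not reproduce the examples in the Data Management Guide: the function names archive_dir and verify_archive are hypothetical, and sha256sum is assumed to be installed on both the source and the destination system.

```shell
#!/bin/bash
# Pack a directory into a single archive and record its checksum.
#   $1  directory to archive
#   $2  archive file to create (e.g. results.tar)
archive_dir() {
    local src_dir=$1 archive=$2
    # -C switches to the parent directory so the archive stores relative paths.
    tar -cf "$archive" -C "$(dirname "$src_dir")" "$(basename "$src_dir")" || return 1
    # Write the checksum beside the archive, using the bare file name so the
    # check still works after both files are copied elsewhere.
    ( cd "$(dirname "$archive")" &&
      sha256sum "$(basename "$archive")" > "$(basename "$archive").sha256" )
}

# Check an archive against its recorded checksum and confirm tar can read it.
#   $1  archive file (its .sha256 file must sit in the same directory)
verify_archive() {
    local archive=$1
    ( cd "$(dirname "$archive")" &&
      sha256sum -c "$(basename "$archive").sha256" > /dev/null ) &&
        tar -tf "$archive" > /dev/null
}

# Example: archive_dir /work/t01/t01/auser/results /work/t01/t01/auser/results.tar
#          verify_archive /work/t01/t01/auser/results.tar
```

Transferring one archive plus its small checksum file, then running verify_archive at the destination, avoids moving many individual files and gives a simple check that nothing was corrupted on the way.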
Software on ARCHER2 is principally accessed through environment modules. These load and unload the desired compilers, tools and libraries through the module command and its subcommands. Some modules will be loaded by default on login, providing a default working environment; many more will be available for use but initially unloaded, allowing you to set up the environment to suit your needs.
At any stage you can check which modules have been loaded by running
module list
Running the following command will display all environment modules available on ARCHER2, whether loaded or unloaded:
module avail
The search field for this command may be narrowed by providing the first few characters of the module name being queried. For example, all available versions and variants of VASP may be found by running
module avail vasp
You will see that different versions are available for many modules. For example, vasp/5/5.4.4 and vasp/6/6.1.0 are two available versions of VASP. Furthermore, a default version may be specified; this is used if no version is provided by the user.
VASP is licensed software, as are other software packages on ARCHER2. You must have a valid licence to use licensed software on ARCHER2. Often you will need to request access through the SAFE. More on this below.
The module load and module add commands perform the same action, loading a module for use. Following the above,
module load vasp/5
would load the default version of VASP 5, while
module load vasp/5/5.4.4
would specifically load version 5.4.4. A loaded module may be unloaded through the identical module unload, module remove or module delete commands, e.g.
module unload vasp
The above unloads whichever version of VASP is currently in the environment. Rather than issuing separate unload and load commands, versions of a module may be swapped as follows:
module swap vasp vasp/5/5.4.4
Other helpful commands are:
- module help <modulename>, which provides a short description of the module
- module show <modulename>, which displays the contents of the modulefile
Points to be aware of include:
- Some modules will conflict with others. A simple example would be the conflict arising when trying to load a different version of an already loaded module. When a conflict occurs, the loading process will fail and an error message will be displayed. Examination of the message and the module output (via module show) should reveal the cause of the conflict and how to resolve it.
- The order in which modules are loaded can matter. Consider two modules which set the same variable to a different value. The final value would be that set by the module which loaded last. If you suspect that two modules may be interfering with one another, you can examine their contents with module show.
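The point about load order can be illustrated without the module command at all. In the sketch below, two hypothetical shell functions stand in for modules that set the same variable; real modulefiles are not shell functions, and the variable name and paths are invented for the example.

```shell
#!/bin/bash
# Hypothetical stand-ins for two modules that both set EXAMPLE_LIB_DIR.
load_module_a() { export EXAMPLE_LIB_DIR=/opt/example/a/lib; }
load_module_b() { export EXAMPLE_LIB_DIR=/opt/example/b/lib; }

# Load a first, then b: the value set by b wins because it was set last.
load_module_a
load_module_b
echo "a then b: $EXAMPLE_LIB_DIR"

# Reverse the order: now the value set by a wins.
load_module_b
load_module_a
echo "b then a: $EXAMPLE_LIB_DIR"
```

The same reasoning applies to variables such as PATH when two real modules modify them, which is why the contents of both modules are worth inspecting when results differ with load order.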
Requesting access to licensed software
Some of the software installed on ARCHER2 requires a user to have a valid licence agreed with the software owners/developers to be able to use it (for example, VASP). Although you will be able to load this software on ARCHER2, you will be barred from actually using it until your licence has been verified.
You request access to licensed software through the EPCC SAFE (the web administration tool you used to apply for your account and retrieve your initial password) by being added to the appropriate Package Group. To request access to licensed software:
- Log in to SAFE
- Go to the Menu Login accounts and select the login account which requires access to the software
- Click New Package Group Request
- Select the software from the list of available packages and click Select Package Group
- Fill in as much information as possible about your licence; at the very least provide the information requested at the top of the screen such as the licence holder's name and contact details. If you are covered by the licence because the licence holder is your supervisor, for example, please state this.
- Click Submit
Your request will then be processed by the ARCHER2 Service Desk who will confirm your licence with the software owners/developers before enabling your access to the software on ARCHER2. This can take several days (depending on how quickly the software owners/developers respond) but you will be advised once this has been done.
Create a job submission script
To run a program on the ARCHER2 compute nodes you need to write a job submission script that tells the system how many compute nodes you want to reserve and for how long. You also need to use the srun command to launch your parallel executable.
For more details on the Slurm scheduler on ARCHER2 and writing job submission scripts see the Running jobs on ARCHER2 section of the User and Best Practice Guide.
Parallel jobs on ARCHER2 should be run from the /work file system as /home is not available on the compute nodes. You will see a file not found error if you try to run a job from the /home file system.
Create a job submission script called submit.slurm in your space on the work file system using your favourite text editor. For example,
auser@uan01:~> cd /work/t01/t01/auser
auser@uan01:/work/t01/t01/auser> vim submit.slurm
You will need to use your project code and username to get to the correct directory, i.e. replace the t01 above with your project code and replace the username auser with your ARCHER2 username.
Paste the following text into your job submission script, replacing [budget code] with your budget code (e.g. t01). The script as written requests the standard partition and the standard quality of service (QoS); change the --partition and --qos values if you wish to run elsewhere.
#!/bin/bash --login
#SBATCH --job-name=test_job
#SBATCH --nodes=1
#SBATCH --tasks-per-node=128
#SBATCH --cpus-per-task=1
#SBATCH --time=0:5:0

# Replace [budget code] below with your project code (e.g. t01)
#SBATCH --account=[budget code]
#SBATCH --partition=standard
#SBATCH --qos=standard

# Setup the batch environment
module load epcc-job-env

# Load the xthi module to get access to the xthi program
module load xthi

# srun launches the parallel program based on the SBATCH options
srun --distribution=block:block --hint=nomultithread xthi
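If you prefer not to edit the --account line by hand, the substitution can be scripted. The sketch below is one possible approach; set_budget_code is a hypothetical helper name, and it assumes the budget code contains no "/" characters.

```shell
#!/bin/bash
# Replace the literal "[budget code]" on the --account line of a job script.
#   $1  job script to edit in place (e.g. submit.slurm)
#   $2  budget code to insert (e.g. t01); must not contain "/"
set_budget_code() {
    local script=$1 budget=$2 tmp status
    tmp=$(mktemp) || return 1
    # Only the "#SBATCH --account" line is changed; the explanatory comment
    # above it, which also mentions [budget code], is left untouched.
    sed "s/^#SBATCH --account=\[budget code\]/#SBATCH --account=$budget/" \
        "$script" > "$tmp" && cat "$tmp" > "$script"
    status=$?
    rm -f "$tmp"
    return $status
}

# Example: set_budget_code submit.slurm t01
```

Writing through a temporary file and copying the result back keeps the original file's permissions and avoids relying on in-place editing options that differ between sed implementations.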
Submit your job to the queue
You submit your job to the queues using the sbatch command:
auser@uan01:/work/t01/t01/auser> sbatch submit.slurm
Submitted batch job 23996
The value returned is your Job ID.
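When submitting from a script it is often handy to keep the Job ID in a variable. A minimal sketch, assuming the message has the form shown above; parse_job_id is a hypothetical helper name.

```shell
#!/bin/bash
# Extract the numeric Job ID from the line that sbatch prints on success.
#   $1  text such as "Submitted batch job 23996"
parse_job_id() {
    # The ID is the fourth whitespace-separated field of the message.
    printf '%s\n' "$1" | awk '/^Submitted batch job/ { print $4 }'
}

# On ARCHER2 this could be used as:
#   job_id=$(parse_job_id "$(sbatch submit.slurm)")
parse_job_id "Submitted batch job 23996"   # prints 23996
```

Slurm's sbatch also accepts a --parsable option that prints the ID without the surrounding text, which may be simpler where it suits your workflow.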
Monitoring your job
You use the squeue command to examine jobs in the queue. To list all the jobs you have in the queue, use:
auser@uan01:/work/t01/t01/auser> squeue -u $USER
Running squeue on its own lists all jobs in the queue from all users.
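Rather than re-running squeue by hand, a small loop can wait until a job has left the queue. This is a sketch under stated assumptions: wait_for_job is a hypothetical helper, the 30-second default interval is an arbitrary choice, and an empty listing is taken to mean the job has finished.

```shell
#!/bin/bash
# Block until a job no longer appears in the queue.
#   $1  Job ID as reported by sbatch
#   $2  seconds between checks (optional; defaults to 30)
wait_for_job() {
    local job_id=$1 interval=${2:-30}
    # -h suppresses the header line and -j restricts the listing to one job,
    # so the output is empty once the job has left the queue.
    while [ -n "$(squeue -h -j "$job_id" 2>/dev/null)" ]; do
        sleep "$interval"
    done
    echo "Job $job_id is no longer in the queue"
}

# Example: wait_for_job 23996 60
```

Because the loop cannot tell a finished job from a failed squeue call, treat its message as a prompt to inspect the job output rather than as proof of success. Keep the interval generous so that polling does not load the scheduler.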
Checking the output from the job
The job submission script above should write the output to a file called slurm-<jobID>.out (i.e. if the Job ID was 23996, the file would be slurm-23996.out). You can check the contents of this file with the cat command. If the job was successful you should see output that looks something like:
auser@eslogin01:/work/t01/t01/auser> cat slurm-23996.out
Node 0, hostname nid001020
Node 0, rank 0, thread 0, (affinity = 0)
Node 0, rank 1, thread 0, (affinity = 1)
Node 0, rank 2, thread 0, (affinity = 2)
Node 0, rank 3, thread 0, (affinity = 3)
Node 0, rank 4, thread 0, (affinity = 4)
Node 0, rank 5, thread 0, (affinity = 5)
Node 0, rank 6, thread 0, (affinity = 6)
Node 0, rank 7, thread 0, (affinity = 7)
Node 0, rank 8, thread 0, (affinity = 8)
Node 0, rank 9, thread 0, (affinity = 9)
Node 0, rank 10, thread 0, (affinity = 10)
Node 0, rank 11, thread 0, (affinity = 11)
Node 0, rank 12, thread 0, (affinity = 12)
Node 0, rank 13, thread 0, (affinity = 13)
Node 0, rank 14, thread 0, (affinity = 14)
Node 0, rank 15, thread 0, (affinity = 15)
Node 0, rank 16, thread 0, (affinity = 16)
Node 0, rank 17, thread 0, (affinity = 17)
Node 0, rank 18, thread 0, (affinity = 18)
Node 0, rank 19, thread 0, (affinity = 19)
Node 0, rank 20, thread 0, (affinity = 20)
Node 0, rank 21, thread 0, (affinity = 21)
... output trimmed ...
If something has gone wrong, you will find any error messages in the file instead of the expected output.
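A quick sanity check on the output is to count the rank lines and compare the total with the number of tasks requested (1 node with 128 tasks per node in the script above, so 128). The sketch below assumes the xthi line format shown in the sample output; count_ranks and check_ranks are hypothetical helper names.

```shell
#!/bin/bash
# Count the ranks reported by xthi in a Slurm output file.
#   $1  output file, e.g. slurm-23996.out
count_ranks() {
    # Match "thread 0" only, so each rank is counted once even if it
    # reports several threads.
    grep -c '^Node [0-9]*, rank [0-9]*, thread 0,' "$1"
}

# Compare the count with the number of tasks that were requested.
#   $1  output file   $2  expected number of ranks
check_ranks() {
    local found
    found=$(count_ranks "$1")
    if [ "$found" -eq "$2" ]; then
        echo "OK: $found ranks reported"
    else
        echo "Mismatch: expected $2 ranks, found $found" >&2
        return 1
    fi
}

# Example: check_ranks slurm-23996.out 128
```

A mismatch usually means the job stopped early or wrote error messages instead, in which case reading the file with cat is the next step.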
You should use the following phrase to acknowledge ARCHER2 for all research outputs that were generated using the ARCHER2 service:
This work used the ARCHER2 UK National Supercomputing Service (https://www.archer2.ac.uk).
You should also tag outputs with the keyword ARCHER2 whenever possible.
If you plan to compile your own programs on ARCHER2, you may also want
to look at
Other documentation you may find useful: