Start a job on terrabyte
Prerequisites
To follow this example, you need a terrabyte/LRZ account with a Two-Factor-Authentication (2FA) method set up, and you must be allowed to log in to the terrabyte Login-Node (SSH into login.terrabyte.lrz.de).
1. Prepare a script or executable
Please save the following code in the file calculation.py in the folder ~/slurm_handson_example.
import sys
import time

# Check if exactly one number is provided as a command-line argument
if len(sys.argv) != 2:
    print("ERROR: No input provided! Usage: python calculation.py <input_number>", file=sys.stderr)
    sys.exit(1)

# Get the input number from the command-line argument
try:
    user_input = float(sys.argv[1])
except ValueError:
    print("ERROR: Invalid input. Please provide a valid number.", file=sys.stderr)
    sys.exit(1)

# Perform a simple calculation (and make it a little bit longer ;-) )
result = user_input * 2
time.sleep(60)

# Print the result
print(f"The result of the calculation is: {result}")
2. Prepare a SLURM-job script
To start a SLURM job, you need to create a job script. It specifies the resources you need, the files for standard output and standard error, and the walltime of the job.
Please save the following code in the file calculation_job_script.sh in the folder ~/slurm_handson_example. You need to replace <your-path-to-home> with the full absolute path to your home directory and <your-email@provider> with your email address.
#!/bin/bash
#SBATCH -J simple_calculation
#SBATCH -o calculation.out
#SBATCH -e calculation.err
#SBATCH -D <your-path-to-home>/slurm_handson_example/
#SBATCH --clusters=hpda2
#SBATCH --partition=hpda2_test
#SBATCH --cpus-per-task=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00
#SBATCH --mail-type=all
#SBATCH --mail-user=<your-email@provider>
module load slurm_setup
module load python
python3 calculation.py 5
Are you wondering about all of these #SBATCH options? The following two links provide further explanations:
- https://docs.terrabyte.lrz.de/services/terrabyte-hpc/job-submission/slurm/#general-options
- https://slurm.schedmd.com/sbatch.html
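The placeholder replacement described in this step can also be scripted. The following is a minimal sketch, assuming the job script is kept as a template string. The helper name render_job_script and the example email address are hypothetical and not part of the terrabyte documentation. Path.home() is used to obtain the absolute path of the home directory.

```python
from pathlib import Path

# The job script from this step, with both placeholders left in place.
JOB_SCRIPT_TEMPLATE = """#!/bin/bash
#SBATCH -J simple_calculation
#SBATCH -o calculation.out
#SBATCH -e calculation.err
#SBATCH -D <your-path-to-home>/slurm_handson_example/
#SBATCH --clusters=hpda2
#SBATCH --partition=hpda2_test
#SBATCH --cpus-per-task=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00
#SBATCH --mail-type=all
#SBATCH --mail-user=<your-email@provider>
module load slurm_setup
module load python
python3 calculation.py 5
"""


def render_job_script(home: str, email: str) -> str:
    """Return the job script with both placeholders filled in (hypothetical helper)."""
    return (
        JOB_SCRIPT_TEMPLATE
        .replace("<your-path-to-home>", home)
        .replace("<your-email@provider>", email)
    )


if __name__ == "__main__":
    # Path.home() resolves to the absolute path of the current user's home directory.
    # The email address is a made-up example; use your own.
    print(render_job_script(str(Path.home()), "jane.doe@example.org"))
```

Redirecting the printed text into ~/slurm_handson_example/calculation_job_script.sh gives the same file as editing the placeholders by hand.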
3. Send SLURM-job script to scheduler
Now you are ready to send the SLURM job to the SLURM scheduler using the following command from the terrabyte Login-Node:
sbatch ~/slurm_handson_example/calculation_job_script.sh
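sbatch reports the ID of the new job, which is useful later when querying the job status. The following sketch shows one way to submit the script from Python and extract that ID. It assumes that sbatch prints its usual confirmation line of the form "Submitted batch job <id>", possibly followed by "on cluster <name>". The helper names parse_job_id and submit are hypothetical.

```python
import re
import subprocess
from pathlib import Path


def parse_job_id(sbatch_output: str) -> int:
    """Extract the job ID from sbatch's confirmation message (assumed format)."""
    match = re.search(r"Submitted batch job (\d+)", sbatch_output)
    if match is None:
        raise ValueError(f"Unexpected sbatch output: {sbatch_output!r}")
    return int(match.group(1))


def submit(job_script: Path) -> int:
    """Run sbatch on the given job script and return the job ID."""
    completed = subprocess.run(
        ["sbatch", str(job_script)],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if sbatch exits with an error
    )
    return parse_job_id(completed.stdout)
```

On the Login-Node, calling submit(Path.home() / "slurm_handson_example" / "calculation_job_script.sh") would return the job ID as an integer.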
4. Check job status and logs
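With the commands squeue and sacct you can check the status of your job. The output and error messages of the script itself are written to the files calculation.out and calculation.err in the working directory set with -D.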
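A sketch of how the two commands are typically called is given below. It assumes that the cluster has to be named explicitly with --clusters, as in the job script, and it uses a placeholder job ID. The helper names and the selection of sacct columns are illustrative choices, not terrabyte requirements.

```python
import shlex

CLUSTER = "hpda2"  # cluster name taken from the job script


def squeue_command(cluster: str = CLUSTER) -> list[str]:
    """Command listing your own pending and running jobs on the cluster."""
    return ["squeue", f"--clusters={cluster}", "--me"]


def sacct_command(job_id: int, cluster: str = CLUSTER) -> list[str]:
    """Command showing accounting data of one job, including finished jobs."""
    return [
        "sacct",
        f"--clusters={cluster}",
        f"--jobs={job_id}",
        "--format=JobID,JobName,State,Elapsed,ExitCode",
    ]


if __name__ == "__main__":
    # 123456 is a placeholder; use the ID reported by sbatch.
    print(shlex.join(squeue_command()))
    print(shlex.join(sacct_command(123456)))
```

The printed lines can be typed directly on the Login-Node, or the lists can be passed to subprocess.run from a Python script.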
5. Submit a list of jobs at once with SLURM job arrays
When you specify the array SBATCH argument, SLURM automatically runs the same job script once for each array index. You can vary the calculation of your program by passing the array task ID, which is available in the variable $SLURM_ARRAY_TASK_ID, as shown below.
Submitting this script to SLURM via the sbatch command starts 10 SLURM jobs, one for each array task ID from 1 to 10.
#!/bin/bash
#SBATCH -J simple_calculation
#SBATCH -o calculation.%a.out
#SBATCH -e calculation.%a.err
#SBATCH -D <your-path-to-home>/slurm_handson_example/
#SBATCH --clusters=hpda2
#SBATCH --partition=hpda2_test
#SBATCH --cpus-per-task=1
#SBATCH --mem=1gb
#SBATCH --time=00:05:00
#SBATCH --mail-type=all
#SBATCH --mail-user=<your-email@provider>
#SBATCH --array=1-10
module load slurm_setup
module load python
python3 calculation.py $SLURM_ARRAY_TASK_ID
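The array task ID does not have to be the input value itself. The following minimal sketch shows how a Python program could use the task ID as a 1-based index into a list of inputs. The list INPUT_VALUES and the helper select_input are hypothetical. SLURM sets the environment variable SLURM_ARRAY_TASK_ID for each array task; the fallback to 1 when the variable is missing is an assumption made so that the sketch also runs outside SLURM.

```python
import os

# Hypothetical input values; with --array=1-10 the list needs ten entries.
INPUT_VALUES = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]


def select_input(task_id: int, values: list[float]) -> float:
    """Return the input value that belongs to a 1-based array task ID."""
    if not 1 <= task_id <= len(values):
        raise IndexError(f"Task ID {task_id} is outside 1..{len(values)}")
    return values[task_id - 1]


if __name__ == "__main__":
    # Fall back to task 1 when run outside of a SLURM array job (assumption).
    task_id = int(os.environ.get("SLURM_ARRAY_TASK_ID", "1"))
    user_input = select_input(task_id, INPUT_VALUES)
    print(f"Task {task_id} uses input {user_input}")
```

With this approach the job script can call the program without an argument, because each task reads its ID from the environment.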