check_cluster

Introduction to check_cluster

check_cluster is a command-line tool that visualizes the utilization of a Slurm cluster. It shows node usage per partition and lists the jobs run by the current user. The tool retrieves this information from Slurm with scontrol show nodes and scontrol show jobs, and refreshes the display every 10 seconds.
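To give a feel for the underlying data, the following is a minimal sketch of how node counts per partition and state could be derived from scontrol output by hand. It is not how check_cluster itself is implemented. The helper name summarize_nodes is hypothetical, and the sketch assumes the one-line-per-node format produced by scontrol -o show nodes, with Partitions= and State= fields.

```shell
# Hypothetical helper: count nodes per partition and state.
# Reads `scontrol -o show nodes` output (one node per line) from stdin.
summarize_nodes() {
  awk '
    {
      part = ""; state = ""
      for (i = 1; i <= NF; i++) {
        if ($i ~ /^Partitions=/) part  = substr($i, 12)  # text after "Partitions="
        if ($i ~ /^State=/)      state = substr($i, 7)   # text after "State="
      }
      if (part == "" || state == "") next
      # A node can belong to several comma-separated partitions.
      n = split(part, parts, ",")
      for (j = 1; j <= n; j++) count[parts[j] " " state]++
    }
    END { for (k in count) print k, count[k] }
  ' | LC_ALL=C sort
}

# Usage on a login node (assumes scontrol is on your PATH):
#   scontrol -o show nodes | summarize_nodes
```

Each output line has the form "partition state count". check_cluster presents comparable information as a continuously refreshed view, so a manual summary like this is mainly useful for scripting or one-off checks.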

Using check_cluster with Modules

To use check_cluster on the terrabyte HPC system, load the check_cluster module with the following command:

# Consider adding the "module use" line to your ~/.bashrc so that terrabyte modules are always available
module use /dss/dsstbyfs01/pn56su/pn56su-dss-0020/usr/share/modules/files/
module load check_cluster
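If you add the module use line to your ~/.bashrc, a guarded variant avoids errors on machines where the module command or the DSS file system is unavailable. The following is only a sketch: the function name use_terrabyte_modules and the variable TERRABYTE_MODULES are hypothetical, and the path is the one given above.

```shell
# Hypothetical ~/.bashrc fragment: register the terrabyte module path
# only when both the module command and the directory are available.
TERRABYTE_MODULES=/dss/dsstbyfs01/pn56su/pn56su-dss-0020/usr/share/modules/files/

use_terrabyte_modules() {
  if type module >/dev/null 2>&1 && [ -d "$TERRABYTE_MODULES" ]; then
    module use "$TERRABYTE_MODULES"
  else
    # Signal that nothing was registered; the caller may ignore this.
    return 1
  fi
}

# Do not abort shell startup if the path is unreachable.
use_terrabyte_modules || true
```

After opening a new shell on the terrabyte system, module load check_cluster should then work without repeating the module use step.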

Once loaded, you can start using check_cluster to monitor the cluster. Below are some examples of common operations:

Usage Examples

Example 1: Check Node Status

To view the status of all nodes in the cluster:

check_cluster