Slurm

Introduction

Slurm [1] is a workload manager for clusters, offering both batch and interactive job scheduling. It is operated through a text-based interface in the Linux terminal.

Slurm provides the following to help you make use of the cluster:

Information about the resources available on the cluster.
Queuing and allocation of jobs based on specified resources.
Job monitoring and status reporting.

These functions are accessed through Slurm commands, including:


 $ sinfo : view the cluster's partitions, nodes and available resources.
 $ squeue : view submitted jobs in the queue.
 $ sbatch : submit a batch job.
 $ sacct : view accounting information for jobs (mainly of use to administrators).
 $ scancel : cancel a job that you have submitted.
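
As a rough illustration of how these commands fit together in one session, the sketch below lists the partitions, shows your own queued jobs, submits a script, then inspects and cancels the resulting job. The file name job.sh and the function name slurm_session are hypothetical, and the guard at the top skips everything on a machine without the Slurm client tools.

```shell
#!/bin/bash
# Sketch of a typical session using the commands listed above.
# Assumption: job.sh is a hypothetical job script in the current directory.

slurm_session() {
    # Do nothing harmful on machines where Slurm is not installed.
    if ! command -v sbatch >/dev/null 2>&1; then
        echo "Slurm client commands not found; run this on a cluster login node."
        return 0
    fi

    # Show partitions and node states.
    sinfo

    # Show only your own pending and running jobs.
    squeue -u "$USER"

    # --parsable prints the job ID, optionally followed by ";cluster".
    jobid=$(sbatch --parsable job.sh) || return 1
    jobid=${jobid%%;*}
    echo "Submitted job ${jobid}"

    # Show accounting information for the job, then cancel it.
    sacct -j "$jobid" --format=JobID,JobName,State,Elapsed
    scancel "$jobid"
}

slurm_session
```

In everyday use each command is usually typed on its own at the prompt; the function only groups them here so the sequence is easy to follow.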

Together with these commands, a job submission script can be provided to Slurm to set a job's parameters. Practical usage examples are given in the subsequent pages.
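
To give a first idea of what such a script looks like, here is a minimal sketch. The #SBATCH lines are standard sbatch options, but the values are assumptions for illustration: the job name, the time limit, the output file name and the helper function report_job are hypothetical, and cpu_only is the partition name mentioned in the Quality of Service section of this page.

```shell
#!/bin/bash
# Name shown by squeue (hypothetical).
#SBATCH --job-name=hello_slurm
# Assumed partition name; check sinfo for the partitions on your cluster.
#SBATCH --partition=cpu_only
# One task using one CPU core.
#SBATCH --ntasks=1
#SBATCH --cpus-per-task=1
# Wall-clock limit in HH:MM:SS.
#SBATCH --time=00:05:00
# %j in the file name expands to the job ID.
#SBATCH --output=hello_%j.out

# Hypothetical helper: print where the job runs and what it was given.
# The defaults after ":-" are used when the script runs outside Slurm.
report_job() {
    echo "Job ${SLURM_JOB_ID:-unknown} running on $(uname -n)"
    echo "CPUs per task: ${SLURM_CPUS_PER_TASK:-1}"
}

report_job
```

Saved as, say, hello.sh (a hypothetical name), the script would be submitted with sbatch hello.sh. Because the #SBATCH lines are ordinary comments to bash, the script can also be run directly for a quick check before submitting it.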


Quality of Service and Limitations

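Users of the CPU resources have no access to the GPU resources and are confined to CPU resources. The limits are set through Slurm's Quality of Service (QOS) settings, which can be listed as follows: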


$ sacctmgr show qos format=Name,Priority,GrpTRES,MaxTRES,MaxTRESMins

          Name   Priority       GrpTRES       MaxTRES   MaxTRESMins
    ---------- ---------- ------------- ------------- -------------
        normal          0
      gpu_only          0                  gres/gpu=2
      cpu_only          0                  gres/gpu=0

gpu_only and cpu_only are Slurm partitions (partitions are to Slurm what queues are to PBS/Torque).
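
As a hedged sketch of how a job might request a GPU under these limits, the script below asks for one GPU in the gpu_only partition, which stays within the gres/gpu=2 limit listed above. The job name, time limit and the helper report_gpus are hypothetical, and whether additional options (such as an explicit QOS) are needed on this cluster is not stated here.

```shell
#!/bin/bash
# Hypothetical job name.
#SBATCH --job-name=gpu_check
# Partition name taken from this page.
#SBATCH --partition=gpu_only
# Request one GPU, within the gres/gpu=2 limit shown for gpu_only.
#SBATCH --gres=gpu:1
# Wall-clock limit in HH:MM:SS.
#SBATCH --time=00:05:00

# Hypothetical helper: report which GPUs, if any, are visible to the job.
# Slurm typically exports CUDA_VISIBLE_DEVICES for GPUs allocated via --gres;
# outside a GPU allocation the variable is usually unset.
report_gpus() {
    echo "Visible GPUs: ${CUDA_VISIBLE_DEVICES:-none}"
}

report_gpus
```

A request that exceeds a limit, for example --gres=gpu:3 here, would be expected to be rejected or left pending by Slurm rather than started.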

Next: Basic_Usage:_CPU_Based_Resources_With_Slurm

Up: HPC_Usage
