Difference between revisions of "Slurm"

Latest revision as of 14:44, 8 May 2025

Introduction

Slurm [1] is a workload manager for clusters, offering both batch and interactive job scheduling. It is operated through a text-based interface in the Linux terminal.

Slurm provides the following to help you make use of the cluster:

Information on the resources available on the cluster.
Queuing and allocation of jobs based on specified resources.
Job monitoring and status reporting.

The main Slurm commands include:


 $ sinfo : view the cluster's partitions, nodes and resources.
 $ squeue : view submitted jobs.
 $ sbatch : submit a batch job.
 $ sacct : view accounting information for running and completed jobs.
 $ scancel : cancel a job that you have submitted.
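As a rough illustration of how these commands fit together, the sketch below wraps three of them in a small helper. The function name `slurm_overview` is hypothetical, and the sketch assumes the Slurm client commands are on your PATH (typically true on a login node); it only reports a message where they are not.

```shell
# Sketch: print a quick overview of the cluster and of your own jobs.
# slurm_overview is a hypothetical helper name, not part of Slurm.
slurm_overview() {
    if ! command -v sinfo >/dev/null 2>&1; then
        echo "Slurm client commands not found on this machine" >&2
        return 1
    fi
    me="$(id -un)"       # current user name
    sinfo                # partitions and node states
    squeue -u "$me"      # your pending and running jobs
    sacct -u "$me"       # accounting records for your jobs
}
```

Calling `slurm_overview` on a login node prints the three listings one after another.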

Together with these commands, a job submission script can be provided to Slurm to set a job's parameters. Practical usage examples are given in the subsequent pages.
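To give a first idea of what a job submission script looks like, here is a minimal sketch. The file name `hello_job.sh`, the job name and the resource values are illustrative assumptions, not settings taken from this cluster; the `#SBATCH` options used are standard `sbatch` options.

```shell
# Write a minimal batch script (hello_job.sh is a hypothetical file name).
cat > hello_job.sh <<'EOF'
#!/bin/bash
# Job name shown in squeue
#SBATCH --job-name=hello
# Run a single task
#SBATCH --ntasks=1
# Wall-clock limit (HH:MM:SS)
#SBATCH --time=00:05:00
# Output file; %j expands to the job ID
#SBATCH --output=hello_%j.out

echo "Hello from $(hostname)"
EOF

# Submit only where sbatch is available; otherwise just report.
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh
else
    echo "sbatch not found; hello_job.sh was written but not submitted"
fi
```

Once submitted, the job can be followed with `squeue` and removed with `scancel` followed by the job ID.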

Watch Demo

Quality of Service and Limitations

Users of CPU resources have no access to GPU resources and are confined to CPU resources. The QOS limits configured on the cluster can be listed as follows:


$ sacctmgr show qos format=Name,Priority,GrpTRES,MaxTRES,MaxTRESMins

      Name   Priority       GrpTRES       MaxTRES   MaxTRESMins
---------- ---------- ------------- ------------- -------------
    normal          0
  gpu_only          0                  gres/gpu=2
  cpu_only          0                  gres/gpu=0
     debug         50                  gres/gpu=1

gpu_only and normal are Slurm QOS (Quality of Service) levels.
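The GPU limit of a given QOS can also be read programmatically. The sketch below is one possible way, assuming `sacctmgr` is run with its `-n` (no header) and `-P` (pipe-separated) options so that each line has the form `Name|MaxTRES`; the helper name `max_gpus_for_qos` is hypothetical.

```shell
# Sketch: print the gres/gpu limit of one QOS.
# Reads "Name|MaxTRES" lines on stdin; $1 is the QOS name.
# Prints nothing when the QOS sets no GPU limit.
max_gpus_for_qos() {
    awk -F'|' -v qos="$1" '
        $1 == qos {
            n = split($2, tres, ",")
            for (i = 1; i <= n; i++) {
                if (tres[i] ~ /^gres\/gpu=/) {
                    sub(/^gres\/gpu=/, "", tres[i])
                    print tres[i]
                }
            }
        }'
}
```

On the cluster this could be used as `sacctmgr -n -P show qos format=Name,MaxTRES | max_gpus_for_qos gpu_only`. A job can then be submitted under a particular QOS with the `--qos` option of `sbatch`, provided your account is allowed to use that QOS.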

Next: Basic_Usage:_CPU_Based_Resources_With_Slurm

Up: HPC_Usage


@@ Line 1: / Line 1: @@
+[[File:Slurm_logo.svg.png|200px]]
 === Introduction ===
 Slurm [https://slurm.schedmd.com/documentation.html] is a workload manager for clusters, offering both batch and interactive job scheduling.
@@ Line 7: / Line 9: @@
 # Queuing and allocation of jobs based on specified resources.
 # Job monitoring and status reporting.
+These commands include:
+<code bash>
+  $ sinfo : to view the cluster, resources and partition
+  $ squeue : view submitted job.
+  $ sbatch : submit a batch job.
+  $ sacct : for admins
+  $ scancel :  to cancel your own job that has been submitted.
+</code>
+Together with these commands, a job submission script can be provided to slurm to set a jobs
+parameters. Practical usage examples will be illustrated in the subsequent pages.
+==[https://asciinema.org/a/FsZFGQQBRcRulln07btPWUR99  Watch Demo ] ==
+== Quality of Service and Limitations ==
+Users of  '''CPU''' resources have zero access to the GPU resources, and are confined to CPU resources.
+<code bash>
+ $ sacctmgr show qos  format=Name,Priority,GrpTRES,MaxTRES,MaxTRESMins
+      Name          Priority       GrpTRES       MaxTRES   MaxTRESMins
+     ---------- ---------- ------------- ------------- -------------
+      normal            0
+      gpu_only          0                      gres/gpu=2
+      cpu_only          0                      gres/gpu=0
+      debug            50                      gres/gpu=1
+</code>
+'''gpu_only''' and '''normal''' are Slurm QOS parameters.
+Next:
+[[Basic_Usage:_CPU_Based_Resources_With_Slurm|Basic_Usage:_CPU_Based_Resources_With_Slurm]]
+Up:
+[[ HPC_Usage| HPC_Usage]]