Difference between revisions of "Slurm"
From KENET Training
Revision as of 12:46, 8 May 2025
Introduction
Slurm [1] is a workload manager for clusters, offering both batch and interactive job scheduling. It is operated through a text-based interface in the Linux terminal.
Slurm provides the following to help you make use of the cluster:
- Information on the resources available on the cluster.
- Queuing of jobs and allocation of the resources they request.
- Job monitoring and status reporting.
These functions are accessed through commands, including:
$ sinfo : view the cluster's partitions and resources.
$ squeue : view submitted jobs.
$ sbatch : submit a batch job.
$ sacct : view accounting records for jobs (mostly used by admins).
$ scancel : cancel a job that you have submitted.
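As a rough illustration of how the commands listed above are typically invoked, the sketch below wraps each call in a small helper. The helper name try_slurm is hypothetical and not part of Slurm; it only lets the snippet run harmlessly on a machine where Slurm is not installed. The options shown (-u for squeue, -X and --format for sacct) are standard ones, but the output you see depends on how the cluster is configured.

```shell
#!/bin/bash
# Hypothetical helper: run a command if it is installed, otherwise report it.
try_slurm() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "skipped (not installed): $*"
    fi
}

# Partitions and the state of their nodes
try_slurm sinfo

# Only the jobs belonging to the current user
try_slurm squeue -u "$(id -un)"

# Accounting summary, one line per job allocation
try_slurm sacct -X --format=JobID,JobName,State,Elapsed

# scancel needs a job ID as reported by sbatch or squeue, for example:
#   scancel <jobid>
```

On a login node of the cluster the helper simply passes each command through, so the same lines can be typed directly without it.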
Together with these commands, a job submission script can be provided to Slurm to set a job's
parameters. Practical usage examples are given in the subsequent pages.
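To give a first idea of what such a job submission script looks like, here is a minimal sketch. It writes a small script to a file and submits it only if sbatch is available. The file name hello_job.sh, the job name, the task count and the time limit are placeholder assumptions; cpu_only is the partition name mentioned on this page. The pages that follow remain the reference for the settings actually recommended on this cluster.

```shell
#!/bin/bash
# Write a minimal job script. Lines starting with "#SBATCH" are read by
# sbatch as job parameters; to bash they are ordinary comments.
cat > hello_job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=hello
#SBATCH --partition=cpu_only
#SBATCH --ntasks=1
#SBATCH --time=00:05:00
#SBATCH --output=hello_%j.out

echo "Running on $(hostname)"
EOF

# Submit only where Slurm is installed (placeholder guard for illustration)
if command -v sbatch >/dev/null 2>&1; then
    sbatch hello_job.sh
else
    echo "sbatch not found; job script written to hello_job.sh"
fi
```

In the --output pattern, %j is replaced by the job ID, so each run writes to its own file.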
Watch Demo: https://asciinema.org/a/FsZFGQQBRcRulln07btPWUR99
Quality of Service and Limitations
Users who are allocated CPU resources have no access to the GPU resources and are confined to the CPU resources. The limits attached to each quality of service (QOS) can be listed with:
$ sacctmgr show qos format=Name,Priority,GrpTRES,MaxTRES,MaxTRESMins
Name Priority GrpTRES MaxTRES MaxTRESMins
---------- ---------- ------------- ------------- -------------
normal 0
gpu_only 0 gres/gpu=2
cpu_only 0 gres/gpu=0
gpu_only and cpu_only are Slurm partitions (a partition in Slurm is the equivalent of a queue in PBS/TORQUE).
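In the column-aligned sacctmgr output it can be hard to tell which limit column a value such as gres/gpu=2 belongs to. One way to make this explicit is to ask sacctmgr for pipe-delimited output and label each non-empty field. The sketch below assumes the same four fields as the command above; the function name show_qos_limits is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read rows of the form Name|GrpTRES|MaxTRES|MaxTRESMins
# from stdin and print each QOS name followed by its non-empty limits,
# each labelled with the column it came from.
show_qos_limits() {
    awk -F'|' '{
        out = $1
        if ($2 != "") out = out " GrpTRES=" $2
        if ($3 != "") out = out " MaxTRES=" $3
        if ($4 != "") out = out " MaxTRESMins=" $4
        print out
    }'
}

if command -v sacctmgr >/dev/null 2>&1; then
    # -n drops the header line, -P separates fields with "|"
    sacctmgr -n -P show qos format=Name,GrpTRES,MaxTRES,MaxTRESMins | show_qos_limits
fi
```

A QOS with no limits set, such as normal in the listing above, is printed as its name alone.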
Next: Basic_Usage:_CPU_Based_Resources_With_Slurm
Up: HPC_Usage