Debugging and Interactive Slurm Jobs
Interactive Jobs and Testing
It is often useful to test commands, prototype code, or debug before submitting production jobs to the cluster. For this, Slurm provides interactive jobs, which give you a terminal on a compute node once the job starts running.
Submitting Interactive Jobs
An interactive job can be submitted with the following srun command, rather than the usual sbatch command:
$ srun --time=00:30:00 --gres=gpu:1 --partition=gpu1 --pty /bin/bash -i
This command blocks the terminal until the job starts, so if other jobs are queued ahead of your interactive job you will need to wait.
Once it starts running, you can work from the terminal as usual. Note that you are now logged into a compute node rather than the login node:
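While srun is waiting, you can check the state of the request from a second terminal on the login node. The sketch below wraps the standard squeue command in a hypothetical helper called show_my_jobs. The function name and the choice of columns are illustrative assumptions, not part of the cluster's setup.

```shell
# Hypothetical helper: list your own pending and running jobs.
# Columns: job ID, partition, name, state, elapsed time,
# and the node list (running) or the pending reason (queued).
show_my_jobs() {
    # Fall back to `id -un` if USER is not set in the environment.
    _smj_user="${USER:-$(id -un)}"
    squeue --user="$_smj_user" --states=PENDING,RUNNING \
           --format="%.10i %.10P %.20j %.10T %.10M %R"
}

# Only call it where the Slurm client tools are installed.
if command -v squeue >/dev/null 2>&1; then
    show_my_jobs
else
    echo "squeue not found: run this on the cluster login node"
fi
```

A job shown in the PENDING state is still queued. Once it switches to RUNNING, the blocked srun terminal should drop into a shell on the allocated node.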
$ module av
$ module load applications/gpu/python/conda-25.1.1-python-3.9.21
$ conda env list
base /opt/ohpc/pub/conda/instdir
python-3.9.21 /opt/ohpc/pub/conda/instdir/envs/python-3.9.21
$ conda activate python-3.9.21
$ mkdir mnist
$ cd mnist
$ wget https://raw.githubusercontent.com/pytorch/examples/refs/heads/main/mnist/main.py
$ python main.py
...
$ exit
You can exit the interactive session with the exit command. This also logs you out of the compute node and returns you to the login node.
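If you are unsure whether a given shell is still inside the interactive job, one quick check is the SLURM_JOB_ID environment variable, which Slurm sets for processes running inside a job. The following is a minimal sketch. The helper name where_am_i is hypothetical, and it uses uname -n only to print the current host name.

```shell
# Hypothetical helper: report whether this shell runs inside a Slurm job.
# Slurm exports SLURM_JOB_ID to processes started within a job.
where_am_i() {
    if [ -n "${SLURM_JOB_ID:-}" ]; then
        echo "inside Slurm job ${SLURM_JOB_ID} on $(uname -n)"
    else
        echo "not inside a Slurm job (host: $(uname -n))"
    fi
}

where_am_i
```

Run before exit, this should report the job ID and the compute node's host name. Run after exit, it should report the login node with no job ID.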
Watch Interactive Jobs Demo