I'm encountering an issue while configuring Slurm in my distributed computing environment. When I launch a process that should only use 4 cores, it ends up blocking all 128 available cores on the node, preventing me from using them for other tasks.
My submission script requests resources with directives such as `--nodes`, `--ntasks`, and `--cpus-per-task`. Despite this, the job occupies all cores on the node instead of adhering to the requested allocation.
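For reference, a minimal sketch of the kind of submission script I am using; the job name, time limit, and executable below are placeholders, not my exact values:

```bash
#!/bin/bash
#SBATCH --nodes=1             # one node
#SBATCH --ntasks=1            # a single task
#SBATCH --cpus-per-task=4     # the task should get 4 cores
#SBATCH --time=01:00:00       # placeholder time limit
#SBATCH --job-name=small-job  # placeholder name

srun ./my_program             # placeholder for the actual workload
```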
Any ideas on why this might be happening or any additional configuration I should be mindful of to prevent a process from occupying all the cores on the node?
I appreciate any guidance or suggestions to resolve this issue. Thank you!
The two most common reasons for this are that (a) you implicitly or unknowingly request all resources of a compute node, or (b) the cluster is configured not to share compute nodes.
Regarding (a), the memory requirement is often the culprit. If the cluster or the partition is configured with `DefMemPerCPU` or `DefMemPerNode` and you do not override it in your submission script, you will prevent other jobs from using the node. Also make sure your environment does not contain variables that influence the resource allocation, e.g. `$SBATCH_EXCLUSIVE`.
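As a sketch of how to check this (the partition name below is a placeholder), you can inspect the active configuration with `scontrol` and look for Slurm input variables in your environment:

```bash
# Show the cluster-wide memory defaults (DefMemPerCPU / DefMemPerNode)
scontrol show config | grep -i defmem

# Show the defaults of a specific partition; replace "mypartition" with yours
scontrol show partition mypartition | grep -i defmem

# Look for Slurm input environment variables that could widen the request,
# e.g. SBATCH_EXCLUSIVE
env | grep '^SBATCH_'
```

If a large default is in effect, explicitly requesting a smaller amount in the submission script, e.g. `#SBATCH --mem-per-cpu=2G` (value illustrative), keeps the job from claiming the whole node's memory.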
Regarding (b), check that `SelectType` is not `select/linear` and that the partition configuration does not have `OverSubscribe=EXCLUSIVE`. If either is set, nodes are not shareable between jobs.
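A quick way to verify both settings (again, the partition name is a placeholder):

```bash
# select/linear allocates whole nodes; select/cons_tres (or cons_res) allows sharing
scontrol show config | grep -i selecttype

# OverSubscribe=EXCLUSIVE forces whole-node allocation for this partition
scontrol show partition mypartition | grep -i oversubscribe
```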