SlurmBestPractices

From neicext


This page contains advice and best practices on how to run and configure Slurm as the LRMS for an NDGF T1 site, gathered from Slurm experts at the Nordic Tier 1 sites.


SLURM configuration tips and tricks

Slides from HEPIX: here


Memory

Slurm operates with a set of different memory measurements:

  • VSS: Virtual Set Size. The size of the total accessible address space of the process.
  • RSS: Resident Set Size. Total size of data held in RAM. It does not include pages that are swapped out, but does include shared memory.
  • PSS: Proportional Set Size. Similar to RSS, but the size of shared memory is divided evenly among the processes sharing it.
  • USS: Unique Set Size. Like RSS, but only counts the process's private (unshared) data.
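The RSS/PSS distinction can be made concrete by summing the per-mapping values the Linux kernel exposes in /proc/<pid>/smaps. A minimal sketch; the sample data below is made up for illustration:

```python
# Sum the "Rss:" or "Pss:" fields from /proc/<pid>/smaps-style text.
# On Linux, the real input would come from open(f"/proc/{pid}/smaps").read().
def sum_smaps_field(smaps_text, field):
    """Sum all '<field>: <n> kB' lines; returns the total in kB."""
    total = 0
    for line in smaps_text.splitlines():
        if line.startswith(field + ":"):
            total += int(line.split()[1])
    return total

# Illustrative (made-up) excerpt: one private mapping, plus one mapping
# shared between four processes, so its Pss is a quarter of its Rss.
sample = """\
Rss:        1024 kB
Pss:        1024 kB
Rss:        4096 kB
Pss:        1024 kB
"""

rss = sum_smaps_field(sample, "Rss")  # 5120 kB counted against the job
pss = sum_smaps_field(sample, "Pss")  # only 2048 kB actually attributable
```

For a multi-process job with large shared data, RSS counts the shared pages once per process, which is exactly why an RSS-based limit can kill a job that is within its real memory budget.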

Slurm uses RSS to enforce memory limits even when cgroups are used. This is a problem for ATLAS multicore jobs: they load a lot of data that is shared between the processes, so they get killed even though they are not actually using more memory than the limit. Slurm should include the option to use PSS instead from version 15.08.
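For sites on a recent enough Slurm, a sketch of the relevant slurm.conf setting; check your version's slurm.conf man page, since the UsePss parameter is only expected from 15.08 onwards:

```
# slurm.conf -- accounting sketch, assuming Slurm >= 15.08
# UsePss makes job accounting judge memory by PSS instead of RSS,
# so pages shared between an ATLAS multicore job's processes are
# counted once rather than once per process.
JobAcctGatherType=jobacct_gather/linux
JobAcctGatherParams=UsePss
```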

Backfill

Slurm is a strict scheduler. The backfill scheduler is cancelled on almost any state change (a new job submission, a node state change, etc.). This can be problematic on large clusters, where such events occur practically continuously. An option is to cancel the backfill scheduler only on a few state changes, or a single one, e.g. a job ending.
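One way to approximate this in slurm.conf is through SchedulerParameters. The bf_continue and bf_interval options are documented backfill tuning knobs, though their exact behaviour varies by Slurm version, so this is only a sketch:

```
# slurm.conf -- backfill tuning sketch; verify option availability
# against your version's slurm.conf man page before deploying.
SchedulerType=sched/backfill
# bf_continue: let a backfill pass continue after it periodically
# releases its locks, instead of restarting from scratch on every
# state change.
# bf_interval: seconds between backfill passes (default 30); raising
# it reduces scheduler churn on busy clusters.
SchedulerParameters=bf_continue,bf_interval=60
```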