SLURM Reference Manual for HP XC System Software

Preface
Scope: This manual explains the design goals and unique roles of LC's locally developed
Simple Linux Utility for Resource Management (SLURM), intended as a customized
replacement for RMS or NQS in allocating compute resources (mostly nodes) to
queued jobs on machines running the CHAOS operating system. Sections describe
the features of both control daemon SLURMCTLD and local daemon SLURMD, as
well as SLURM's adaptability by means of plugin modules. The five SLURM user
utilities for querying and controlling jobs managed by SLURM are also introduced.
The features and options of SRUN, the tool used to launch both parallel interactive
and batch jobs under SLURM management, receive especially detailed treatment.
Another section explains how to monitor SRUN-submitted jobs by using SQUEUE,
as well as how to customize SQUEUE's reports using its own format specification
language. Likewise, a section tells how to check the current status of nodes
(individually or by partition) using SINFO. The general-user features of SCONTROL
are also included.
Availability: SLURM is part of the CHAOS project, and is available on selected large LC clusters
that run the CHAOS version of Linux.
Consultant: For help, contact the LC customer service and support hotline at 925-422-4531 (open
e-mail: lc-hotline@llnl.gov, SCF e-mail: lc-hotline@pop.llnl.gov).
Printing: The print file for this document can be found at
OCF: http://www.llnl.gov/LCdocs/slurm/slurm.pdf
SCF: https://lc.llnl.gov/LCdocs/slurm/slurm_scf.pdf