SLURM Reference Manual for HP XC System Software

SLURM Operation
SLURM Utilities
SLURM's five command-line utilities provide its direct interface for users (while LCRM utilities, as
explained in EZJOBCONTROL (URL: http://www.llnl.gov/LCdocs/ezjob), provide an indirect interface).
These utilities are:
SRUN submits jobs to run under SLURM management. SRUN can
(A) submit a batch job and then terminate, or
(B) submit an interactive job and then persist to shepherd the job as it runs, or
(C) allocate resources to a shell and then spawn that shell for use in running subordinate
jobs.
SLURM associates every set of parallel tasks ("job steps") with the SRUN instance
that initiated that set, and SRUN gives you elaborate control over node choice and
I/O redirection for your parallel job. (Job steps are not supported on BlueGene/L.)
SQUEUE displays (by default) the queue of running and waiting jobs (or "job steps"), including
the JobId (used for SCANCEL), and the nodes assigned to each running job. But you
can customize SQUEUE reports to cover any of 24 different job properties, sorted
by the properties most important to you.
SINFO displays a summary of status information on SLURM-managed partitions and nodes
(not jobs). Customizable SINFO reports can cover the node count, state, and name
list for a whole partition, or the CPUs, memory, disk space, or current status for
individual specified nodes.
SCANCEL cancels a running or waiting job, or sends a specified signal to all processes on all
nodes associated with a job (only job owners or their administrators can cancel their
jobs).
SCONTROL (privileged users only) manages available nodes (for example, by "draining" jobs
from a node or partition to prepare it for servicing) and assigns properties to node
partitions.
SMAP (on BlueGene/L only) displays a character-based chart or plot showing how nodes
are allocated geometrically among current jobs and BG/L partitions (a job-planning
tool).
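
As a minimal sketch of how a customized SQUEUE report and an SCANCEL signal request might be scripted, the Python fragment below only builds argument lists and parses sample text; it does not contact SLURM. The helper names (squeue_command, parse_report, scancel_command) and the sample job data are hypothetical. The option spellings (--format, --sort, --noheader, --signal) and the format codes (%i, %P, %j, %u, %T, %N) are assumed from current SLURM releases and should be checked against the man pages installed on your system.

```python
# Assumed subset of squeue format codes; verify with "man squeue" locally.
SQUEUE_FIELDS = {
    "jobid": "%i",
    "partition": "%P",
    "name": "%j",
    "user": "%u",
    "state": "%T",
    "nodes": "%N",
}


def squeue_command(fields, sort_by=None):
    """Build an argv list for a headerless, pipe-delimited squeue report."""
    fmt = "|".join(SQUEUE_FIELDS[f] for f in fields)
    argv = ["squeue", "--noheader", f"--format={fmt}"]
    if sort_by:
        # Sort keys reuse the format letters with the leading '%' dropped.
        keys = ",".join(SQUEUE_FIELDS[f].lstrip("%") for f in sort_by)
        argv.append(f"--sort={keys}")
    return argv


def parse_report(text, fields):
    """Turn pipe-delimited report text into a list of dicts keyed by field."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        values = [v.strip() for v in line.split("|")]
        rows.append(dict(zip(fields, values)))
    return rows


def scancel_command(job_id, signal=None):
    """Build an argv list that cancels a job or, if given, sends it a signal."""
    argv = ["scancel"]
    if signal:
        argv.append(f"--signal={signal}")
    argv.append(str(job_id))
    return argv


if __name__ == "__main__":
    fields = ["jobid", "user", "state", "nodes"]
    print(squeue_command(fields, sort_by=["user"]))
    # Hypothetical report text; a waiting job has no nodes assigned yet.
    sample = "101|alice|RUNNING|n[1-4]\n102|bob|PENDING|\n"
    for row in parse_report(sample, fields):
        print(row)
    print(scancel_command(101, signal="USR1"))
```

On a machine where SLURM is installed, the returned lists could be passed to subprocess.run; building them separately keeps the sketch usable for inspection on systems without SLURM.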