HP XC System Software Administration Guide Version 3.2

primary slurmctld daemon. On returning to service, the primary slurmctld daemon regains
control of the SLURM subsystem from the backup slurmctld daemon.
SLURM offers a set of utilities that provide information about SLURM configuration, state, and
jobs, most notably scontrol, squeue, and sinfo. See scontrol(1), squeue(1), and sinfo(1) for
more information about these utilities.
SLURM enables you to collect and analyze job accounting information. “Configuring Job
Accounting” (page 180) describes how to configure job accounting information on the HP XC
system.
“SLURM Troubleshooting” (page 261) provides SLURM troubleshooting information.
15.2 Configuring SLURM
The HP XC system provides global and local directories for SLURM files:
• The /hptc_cluster/slurm directory is the sharable location for SLURM files that need
  to be shared between the nodes. The SLURM slurmctld state files, job logging files, and
  the slurm.conf configuration file reside there.
• The /var/slurm directory is the location for SLURM files that should remain local to a
  given node; the files in this directory are not shared between nodes. All SLURM daemon
  logs and slurmd state information are stored there.
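The split between the shared and the node-local directory can be illustrated with a short Python sketch. It is not part of the HP XC software; the function name classify_slurm_path and the "other" category are assumptions made for illustration, and only the two directory roots come from this guide.

```python
from pathlib import PurePosixPath

# Directory roots named in this guide.
SHARED_ROOT = PurePosixPath("/hptc_cluster/slurm")  # shared between nodes
LOCAL_ROOT = PurePosixPath("/var/slurm")            # local to each node


def _is_under(path, root):
    """Return True if path is root itself or lies below it."""
    try:
        path.relative_to(root)
        return True
    except ValueError:
        return False


def classify_slurm_path(path):
    """Hypothetical helper: report whether a SLURM file path is in the
    shared tree, the node-local tree, or neither."""
    p = PurePosixPath(path)
    if _is_under(p, SHARED_ROOT):
        return "shared"
    if _is_under(p, LOCAL_ROOT):
        return "local"
    return "other"
```

For example, classify_slurm_path("/hptc_cluster/slurm/etc/slurm.conf") returns "shared", and classify_slurm_path("/var/slurm/log/slurmd.log") returns "local".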
As a resource manager on an HP XC system, SLURM allocates exclusive or nonexclusive access
to resources on compute nodes for users to perform work; it provides a framework to start,
execute, and monitor work (normally parallel jobs) on the set of allocated nodes.
All SLURM configuration options are set and stored in the
/hptc_cluster/slurm/etc/slurm.conf file. For information about available options, see
slurm.conf(5). The slurm.conf file also contains useful commentary on the purpose of each
setting.
The following SLURM configuration settings are preset statically on HP XC systems:
StateSaveLocation=/hptc_cluster/slurm/state
SlurmdSpoolDir=/var/slurm/state
SlurmUser=slurm
SlurmctldLogFile=/var/slurm/log/slurmctld.log
SlurmdLogFile=/var/slurm/log/slurmd.log
SlurmctldPidFile=/var/slurm/run/slurmctld.pid
SlurmdPidFile=/var/slurm/run/slurmd.pid
AuthType=auth/munge
JobCompType=jobcomp/filetxt
SwitchType=switch/none
JobCompLoc=/hptc_cluster/slurm/job/slurm.job.log
JobAcctType=jobacct/log
JobAcctLoc=/hptc_cluster/slurm/job/jobacct.log
ReturnToService=1
PropagateResourceLimitsExcept=RLIMIT_NPROC
The slurm.conf file must be available on each node of the HP XC system.
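One way to confirm that a slurm.conf file still carries the preset values is to parse its Key=Value lines and compare them with the list above. The following Python sketch does this for a few of the settings. It is an illustration only: the helper names parse_slurm_conf and find_mismatches are hypothetical, the sample text stands in for the real file, and the parser is simplified (it matches setting names exactly and treats everything after the first "=" on a line as the value).

```python
# A few of the preset values listed in this section.
EXPECTED_PRESETS = {
    "StateSaveLocation": "/hptc_cluster/slurm/state",
    "SlurmdSpoolDir": "/var/slurm/state",
    "SlurmUser": "slurm",
    "AuthType": "auth/munge",
    "ReturnToService": "1",
}


def parse_slurm_conf(text):
    """Hypothetical helper: return a dict built from simple Key=Value lines.

    Text after '#' is treated as a comment; blank lines and lines
    without '=' are skipped.
    """
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings


def find_mismatches(settings, expected=EXPECTED_PRESETS):
    """Return {name: (expected_value, actual_value)} for each setting
    that is missing or differs from the expected preset."""
    return {
        name: (value, settings.get(name))
        for name, value in expected.items()
        if settings.get(name) != value
    }


# Sample text standing in for /hptc_cluster/slurm/etc/slurm.conf.
SAMPLE = """
# SLURM configuration (excerpt)
StateSaveLocation=/hptc_cluster/slurm/state
SlurmdSpoolDir=/var/slurm/state
SlurmUser=slurm
AuthType=auth/munge
ReturnToService=1   # preset on HP XC systems
"""

if __name__ == "__main__":
    print(find_mismatches(parse_slurm_conf(SAMPLE)))
```

On a real system, the sample string would be replaced by the contents of the slurm.conf file; an empty result means that none of the checked presets has been changed.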
Table 15-1 displays the SLURM configuration settings that are set (and, if necessary, adjusted)
during the execution of the cluster_config utility.
Table 15-1 SLURM Configuration Settings
Setting            Default Value*
ControlMachine     Lowest-numbered resource management node
BackupController   Second-lowest resource management node (if available)
SlurmUser          'slurm'
NodeName           All compute nodes, plus 'Procs=2'
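The ControlMachine and BackupController defaults in Table 15-1 amount to ordering the resource management nodes by node number and taking the first two. The Python sketch below shows that selection rule under stated assumptions: node names are assumed to end in a number (for example n15), and pick_controllers is a hypothetical function, not the actual logic of the cluster_config utility, which this guide does not show.

```python
import re


def node_number(name):
    """Extract the trailing number from a node name such as 'n15'.
    The naming pattern is an assumption made for this sketch."""
    match = re.search(r"(\d+)$", name)
    if match is None:
        raise ValueError(f"node name has no numeric suffix: {name!r}")
    return int(match.group(1))


def pick_controllers(resource_mgmt_nodes):
    """Return (control_machine, backup_controller).

    The control machine is the lowest-numbered resource management node;
    the backup controller is the second-lowest, or None if only one
    resource management node exists.
    """
    ordered = sorted(set(resource_mgmt_nodes), key=node_number)
    if not ordered:
        raise ValueError("at least one resource management node is required")
    control = ordered[0]
    backup = ordered[1] if len(ordered) > 1 else None
    return control, backup
```

With the hypothetical node list ["n16", "n2", "n15"], pick_controllers returns ("n2", "n15"); with a single node ["n7"], it returns ("n7", None), which corresponds to the "(if available)" qualifier in the table.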