SLURM Reference Manual for HP XC System Software

SLURMD
The SLURMD daemon runs on every compute node of every cluster that SLURM manages, and it
performs the lowest-level work of resource management. Like SLURMCTLD (above), SLURMD is
multi-threaded for efficiency, but unlike SLURMCTLD it runs with root privilege (so that it can
initiate jobs on behalf of other users).
SLURMD carries out five key tasks and has five corresponding subsystems:
Machine Status
responds to SLURMCTLD requests for machine state information and sends
asynchronous reports of state changes to help with queue control.
Job Status
responds to SLURMCTLD requests for job state information and sends asynchronous
reports of state changes to help with queue control.
Remote Execution
starts, monitors, and cleans up after a set of processes (usually belonging to a parallel
job), as directed by SLURMCTLD (or by direct user intervention). This often involves
many process-limit, environment-variable, working-directory, and user-id changes.
Stream Copy Service
handles all STDERR, STDIN, and STDOUT for remote tasks. This may involve
redirection, and it always involves locally buffering job output to avoid blocking local
tasks.
Job Control
propagates signals and job-termination requests to any SLURM-managed processes
(often interacting with the Remote Execution subsystem).
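
The Remote Execution, Stream Copy Service, and Job Control roles described above can be illustrated with a small Python sketch. This is not SLURMD's implementation; the names run_task and drain are hypothetical and exist only for this example. The sketch starts one task with its own environment and working directory, copies the task's STDOUT and STDERR into local buffers on separate threads so that the task never blocks on a full pipe, and sends a termination signal to the task's whole process group if a time limit expires. User-id and process-limit changes are left out on the assumption that the sketch runs without root privilege.

```python
import os
import signal
import subprocess
import threading


def drain(stream, buffer):
    """Copy everything from a task's output pipe into a local buffer."""
    # read1() returns as soon as some data is available; b"" means end of file.
    for chunk in iter(lambda: stream.read1(4096), b""):
        buffer.append(chunk)
    stream.close()


def run_task(argv, env, cwd, timeout=None):
    """Start a task, buffer its output locally, and clean up after it.

    Returns (exit_status, stdout_bytes, stderr_bytes).
    """
    # Remote Execution: the task gets its own environment and working
    # directory. start_new_session=True puts it in a new process group
    # whose id equals the task's pid, so signals can reach its children.
    proc = subprocess.Popen(
        argv,
        env=env,
        cwd=cwd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        start_new_session=True,
    )

    # Stream copy: one reader thread per stream keeps the pipes empty.
    out, err = [], []
    readers = [
        threading.Thread(target=drain, args=(proc.stdout, out)),
        threading.Thread(target=drain, args=(proc.stderr, err)),
    ]
    for reader in readers:
        reader.start()

    # Job control: on timeout, signal the whole process group.
    try:
        proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        os.killpg(proc.pid, signal.SIGTERM)
        proc.wait()

    for reader in readers:
        reader.join()
    return proc.returncode, b"".join(out), b"".join(err)
```

A caller might invoke run_task(["hostname"], dict(os.environ), "/tmp") and inspect the returned exit status and buffered output. On POSIX systems, a negative exit status means that the task was ended by the signal with that number.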