SLURM Reference Manual for HP XC System Software

ATTACH.
You can monitor or intervene in an already running SRUN job, either batch (started with -b) or
interactive ("allocated," started with -A), by executing SRUN again and "attaching" (-a, lowercase)
to that job. For example,
srun -a 6543 -j
forwards the standard output and error messages from the running job with SLURM ID 6543 to the
attaching SRUN to reveal the job's current status, and (with -j, lowercase) also "joins" the job so that
you can send it signals as if this SRUN had initiated the job. Omit -j for read-only attachments.
Because you are attaching to a running job whose resources have already been allocated, SRUN's
resource-allocation options (such as -N) are incompatible with -a.
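The two modes of attachment described above can be sketched as follows (job ID 6543 is the example ID from the text; the commands assume that job is still running):

```shell
# Read-only attach: forward stdout/stderr from job 6543 and
# report the job's current status; signals are NOT forwarded.
srun -a 6543

# Attach and join: as above, but this srun may also send
# signals (e.g., CTRL-C) to the job as if it had started it.
srun -a 6543 -j
```

Note that no node-count or other allocation options appear here; the job's resources were fixed when it was first launched.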
BATCH (WITH LCRM).
On machines where LC's metabatch job-control and accounting system LCRM/DPCS is installed,
you can submit (with the LCRM utility PSUB) a script to LCRM that contains (simple) SRUN
commands within it to execute parallel jobs later, after LCRM applies the usual fair-share scheduling
process to your job and its competitors. Here LCRM takes the place of SRUN's -b option for indirect,
across-machine job-queue management.
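A minimal LCRM batch script might look like the following sketch. The #PSUB resource directives, their arguments, and the script and application names are illustrative assumptions, not taken from this manual:

```shell
#!/bin/sh
# Hypothetical LCRM batch script; submit with:  psub myjob.sh
#PSUB -ln 4          # request 4 nodes (assumed directive syntax)
#PSUB -tM 30m        # request 30 minutes of run time (assumed)

# After LCRM's fair-share scheduler starts the job, a simple
# SRUN (no -b option) launches the parallel tasks on the
# nodes LCRM allocated.
srun ./my_parallel_app
```

Because LCRM manages the queueing and allocation, the embedded SRUN commands stay simple: they launch tasks on whatever resources the script was granted.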
SRUN SIGNAL HANDLING.
Signals sent to SRUN are automatically forwarded to the tasks that SRUN controls, with a few special
cases. SRUN handles the sequence CTRL-C differently depending on how many it receives in one second:
CTRL-Cs within one second    Effect
-------------------------    ------------------------------------
First                        Reports the state of all tasks
                             associated with SRUN.
Second                       Sends the SIGINT signal to all tasks
                             associated with SRUN.
Third                        Terminates the job at once, without
                             waiting for remote tasks to exit.
MPI SUPPORT.
On computer clusters with a Quadrics interconnect among the nodes (such as Lilac on SCF, or Thunder
and ALC on OCF) SRUN directly supports the Quadrics version of MPI without modification. Applications
built using the Quadrics MPI library communicate over the Quadrics interconnect without any special
SRUN options.
You may also use MPICH on any computer where it is available. MPIRUN will, however, need
information on its command line identifying the resources to use, namely
-np $SLURM_NPROCS -machinefile filename
where SLURM_NPROCS is the environment variable that contains the (-n) number of processors to use
and filename lists the names of the nodes on which to execute (the captured output from /bin/hostname
run across those nodes with simple SRUN). Sometimes the MPICH vendor configures these options
automatically. See also SRUN's --mpi "working features" option (page 26).
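The machine file described above can be built with a simple SRUN just before invoking MPIRUN; the file name and application name below are placeholders:

```shell
# Capture one hostname per task across the allocated nodes,
# one name per line, into the machine file.
srun /bin/hostname > nodes.txt

# Launch the MPICH application on those nodes, using the
# SLURM-provided process count.
mpirun -np $SLURM_NPROCS -machinefile nodes.txt ./my_app
```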