10.8 How LSF-HPC and SLURM Launch and Manage a Job

Figure 10-1 How LSF-HPC and SLURM Launch and Manage a Job

[Figure 10-1 shows a user on login node n16 submitting the job with bsub. The LSF execution host, lsfhost.localdomain, allocates four compute nodes (n1 through n4) in SLURM, setting SLURM_JOBID=53 and SLURM_NPROCS=4, and starts myscript on node n1 through the job starter script job_starter.sh ($ srun -n1 myscript). From n1, the script's hostname, srun hostname, and mpirun -srun ./hellompi commands run across the allocated compute nodes. Numbered callouts 1 through 7 mark the steps described in the following list.]
1. A user logs in to login node n16.
2. The user executes the following LSF bsub command on login node n16:
$ bsub -n4 -ext "SLURM[nodes=4]" -o output.out ./myscript
This bsub command requests four cores (from the -n4 option) spread across four nodes (from the -ext "SLURM[nodes=4]" option); the job is launched on those cores. The script myscript, shown here, runs the job:
#!/bin/sh
hostname                  # runs once, on the first node of the allocation (n1)
srun hostname             # runs on every node in the SLURM allocation (n1 through n4)
mpirun -srun ./hellompi   # launches the MPI program across the allocation
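After submission, the job can be followed from LSF-HPC's point of view. As a brief sketch (assuming the standard LSF client commands are on the path; output.out is the file named by the -o option in this example):
$ bjobs            # reports the pending/running/finished state of the job
$ cat output.out   # after the job completes, shows its captured output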
3. LSF-HPC schedules the job and monitors the state of the resources (compute nodes) in the
SLURM lsf partition. When the LSF-HPC scheduler determines that the required resources
are available, LSF-HPC allocates those resources in SLURM and obtains a SLURM job
identifier (jobID) that corresponds to the allocation.
In this example, four cores spread over four nodes (n1, n2, n3, n4) are allocated for myscript, and SLURM jobID 53 is assigned to the allocation.
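The allocation can also be confirmed directly from SLURM. A minimal sketch (standard SLURM commands; the lsf partition name and jobID 53 come from this example):
$ sinfo --partition lsf   # state of the compute nodes in the SLURM lsf partition
$ squeue --jobs 53        # the SLURM allocation that backs the LSF-HPC job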