HP XC System Software User's Guide Version 3.2

5.2 Submitting a Serial Job Using LSF-HPC.........................................................................................53
5.2.1 Submitting a Serial Job with the LSF bsub Command............................................................53
5.2.2 Submitting a Serial Job Through SLURM Only......................................................................54
5.3 Submitting a Parallel Job.................................................................................................................55
5.3.1 Submitting a Non-MPI Parallel Job.........................................................................................55
5.3.2 Submitting a Parallel Job That Uses the HP-MPI Message Passing Interface.........................56
5.3.3 Submitting a Parallel Job Using the SLURM External Scheduler...........................................57
5.4 Submitting a Batch Job or Job Script...............................................................................................60
5.5 Submitting Multiple MPI Jobs Across the Same Set of Nodes........................................................62
5.5.1 Using a Script to Submit Multiple Jobs...................................................................................62
5.5.2 Using a Makefile to Submit Multiple Jobs..............................................................................62
5.6 Submitting a Job from a Host Other Than an HP XC Host.............................................................65
5.7 Running Preexecution Programs....................................................................................................65
6 Debugging Applications.............................................................................................67
6.1 Debugging Serial Applications.......................................................................................................67
6.2 Debugging Parallel Applications....................................................................................................67
6.2.1 Debugging with TotalView.....................................................................................................68
6.2.1.1 SSH and TotalView..........................................................................................................68
6.2.1.2 Setting Up TotalView......................................................................................................68
6.2.1.3 Using TotalView with SLURM........................................................................................69
6.2.1.4 Using TotalView with LSF-HPC.....................................................................................69
6.2.1.5 Setting TotalView Preferences.........................................................................................69
6.2.1.6 Debugging an Application..............................................................................................70
6.2.1.7 Debugging Running Applications..................................................................................71
6.2.1.8 Exiting TotalView............................................................................................................71
7 Monitoring Node Activity............................................................................................73
7.1 Installing the Node Activity Monitoring Software.........................................................................73
7.2 Using the xcxclus Utility to Monitor Nodes....................................................................................73
7.3 Plotting the Data from the xcxclus Datafiles...................................................................................76
7.4 Using the xcxperf Utility to Display Node Performance................................................................77
7.5 Plotting the Node Performance Data..............................................................................................79
7.6 Running Performance Health Tests.................................................................................................80
8 Tuning Applications.....................................................................................................85
8.1 Using the Intel Trace Collector and Intel Trace Analyzer...............................................................85
8.1.1 Building a Program — Intel Trace Collector and HP-MPI......................................................85
8.1.2 Running a Program – Intel Trace Collector and HP-MPI.......................................................86
8.2 The Intel Trace Collector and Analyzer with HP-MPI on HP XC...................................................87
8.2.1 Installation Kit.........................................................................................................................87
8.2.2 HP-MPI and the Intel Trace Collector.....................................................................................87
8.3 Visualizing Data – Intel Trace Analyzer and HP-MPI....................................................................89
9 Using SLURM................................................................................................................91
9.1 Introduction to SLURM...................................................................................................................91
9.2 SLURM Utilities...............................................................................................................................91
9.3 Launching Jobs with the srun Command.......................................................................................91
9.3.1 The srun Roles and Modes......................................................................................................92
9.3.1.1 The srun Roles.................................................................................................................92
9.3.1.2 The srun Modes...............................................................................................................92
9.3.2 Using the srun Command with HP-MPI................................................................................92