HP XC System Software Installation Guide Version 4.0

Table 2-1 HP XC Software Stack (continued)
Description | Software Product Name
HP XC System Software provides the installation, configuration,
administration, and management tools to support HP XC systems on HP
Cluster Platforms 3000, 3000BL, 4000, 4000BL and 6000.
HP XC System Software Version 4.0
HPC Linux from HP provides Linux ABI (Application Binary Interface)
compatibility, which provides:
• The ability to run binary serial codes from compatible Linux systems
• Access to community-developed software and access to a large
  application catalog
HPC Linux for High Performance
Computing (HPC Linux)
LVS provides a system alias that enables user logins to be distributed
across multiple login nodes and single system sign-on for both users and
administrators.
Linux Virtual Server (LVS)
LSF, the high performance computing version of LSF from Platform
Computing Inc, has been integrated with SLURM in response to the
growing need for a lightweight, powerful workload management system
that is scalable and can support parallel, compute-intensive workloads
across computing resources.
LSF with SLURM contains the same queuing and scheduling management
as standard LSF, but it is integrated with SLURM to gather information
and manage the compute resources. This integration lets users run
SLURM's simple commands to perform a variety of parallel tasks
within their LSF batch scripts. SLURM also gives administrators a
small set of powerful tools to manage the resources of an HP
XC system.
LSF with SLURM
MySQL is a third-party application that creates and modifies the HP XC
configuration and management database (CMDB).
MySQL
Nagios is a system and network monitoring application. It watches hosts
and services that you specify and alerts you when problems occur or are
resolved. On an HP XC system, Nagios is integrated with Supermon for
monitoring capabilities.
Nagios
The pdsh shell is a multithreaded remote shell that executes commands
on multiple remote hosts in parallel.
Parallel Distributed Shell (pdsh)
SLURM was developed by Lawrence Livermore National Laboratory and
Linux Networks. SLURM is a resource manager for Linux clusters. It
manages the key resource on an HP XC system: the compute nodes.
SLURM
Standard LSF, the industry-standard LSF product developed by Platform
Computing Inc, is used for workload management across clusters of
compute resources. It features comprehensive workload management
policies (fairshare, preemption, backfill, advance reservation,
service-level agreement, and so on) in addition to simple first-come,
first-served scheduling. Standard LSF is suited for jobs that do not have
complex parallel computational needs and is ideal for processing large
volumes of serial, single-process jobs.
When you install standard LSF, you are prompted to install the standard
LSF extensions. These extensions include MPI and mpich features.
For more information about where to obtain Platform Computing Inc LSF
documentation, see “Supplementary Software Products” (page 18).
Standard LSF
Supermon is a highly scalable, high-speed cluster monitoring system.
Supermon provides all required node statistics to the Nagios subsystem.
System statistics are tiered, aggregated, and stored in the configuration
and management database.
Supermon
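
The pdsh entry in the table above describes a multithreaded remote shell that runs a command on many hosts in parallel. The following Python sketch illustrates that fan-out idea only; it is not how pdsh is implemented. It assumes passwordless ssh to each node is already configured. The function names (run_on_host, run_parallel) and the node names in the usage note are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
import subprocess


def run_on_host(host, command):
    """Run `command` on `host` over ssh and return (host, exit_code, output)."""
    # Assumption: passwordless ssh to `host` is already set up.
    result = subprocess.run(
        ["ssh", host, command],
        capture_output=True,
        text=True,
    )
    return host, result.returncode, result.stdout.strip()


def run_parallel(hosts, command, runner=run_on_host, max_workers=8):
    """Send `command` to every host concurrently; return results keyed by host."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # One task per host; at most `max_workers` run at the same time.
        futures = [pool.submit(runner, host, command) for host in hosts]
        results = {}
        for future in futures:
            host, code, output = future.result()
            results[host] = (code, output)
    return results
```

For example, run_parallel(["n1", "n2", "n3"], "uptime") would collect the exit code and output of uptime from three hypothetical nodes. The runner parameter exists so that the ssh call can be replaced with a stub when no cluster is available.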