HP XC System Software Administration Guide Version 3.2

8 Monitoring the System with Nagios

The Nagios open source application has been customized and configured to monitor the HP XC

system and network health. This chapter introduces Nagios and discusses these modifications.

This chapter addresses the following topics:

• “Nagios Overview” (page 105)

• “Using the Nagios Web Interface” (page 107)

• “Adjusting the Nagios Configuration” (page 114)

• “Configuring Nagios on HP XC Systems” (page 120)

• “Nan Notification Aggregator and Delimiter” (page 126)

• “Nagios Report Generator Utility” (page 126)

For information on using Nagios to monitor the optional HP Modular Cooling System (MCS),

see Appendix D (page 311).

8.1 Nagios Overview

The HP XC System Software uses the Nagios open source application to gather and display

system statistics, such as processor load and disk usage. Nagios watches hosts and services and

alerts you when problems occur or are resolved. HP XC System Software integrates Nagios with

other software packaged with it, including Supermon, SLURM, and LSF.

The design of the Nagios application incorporates the concept of a plug-in, that is, an independent

file that extends the Nagios application. This design allows the development of service checks,

which are used to examine system and network services.
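
The plug-in concept can be illustrated with a minimal sketch of a service check. It relies only on the general Nagios plug-in convention that a check prints one status line and reports its result through an exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The function names, the 80 and 90 percent thresholds, and the choice of disk usage as the measured value are assumptions made for illustration; this is not one of the checks supplied with the HP XC System Software.

```python
import shutil

# Conventional Nagios plug-in exit codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate_usage(percent_used, warn=80.0, crit=90.0):
    """Map a disk-usage percentage to an (exit_code, status_line) pair.

    The warn and crit thresholds are illustrative defaults only.
    """
    if percent_used >= crit:
        return CRITICAL, f"DISK CRITICAL - {percent_used:.1f}% used"
    if percent_used >= warn:
        return WARNING, f"DISK WARNING - {percent_used:.1f}% used"
    return OK, f"DISK OK - {percent_used:.1f}% used"


def main(path="/"):
    """Check one file system, print the status line, return the exit code."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as err:
        # The check itself could not run, so report UNKNOWN.
        print(f"DISK UNKNOWN - cannot read {path}: {err}")
        return UNKNOWN
    code, line = evaluate_usage(100.0 * usage.used / usage.total)
    print(line)
    return code
```

To run the sketch as a stand-alone check, the file would end with a call such as sys.exit(main()) so that the returned code becomes the process exit status that Nagios reads.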

Nagios, as provided with the HP XC System Software, is configured with system and network

service checks already in place for your system; they were automatically configured for each

node with the nagios nconfig script when the cluster_config utility was run.

The HP XC system automatically configures the Nagios environment based on the configuration

of the HP XC system. Autoconfiguration is based on the information in the HP XC configuration

and management database (CMDB). The configuration is updated as a result of changes to the

HP XC database.

Nagios obtains most of its data from the Supermon open source monitoring application, which

is integrated with the HP XC System Software.

The Nagios master can be configured for improved availability. When configured for improved

availability, the head node must have the management_server role but not the

management_hub role; the other node in the availability set must have the management_server

role and the management_hub role. By default, the head node acts as the Nagios master and

the other node in the availability set acts as a Nagios monitor. If the head node fails, the

availability tool reconfigures the other node in the availability set to act as both the Nagios master

and a Nagios monitor.
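
The role requirement for improved availability can be expressed as a small check. The sketch below is only an illustration of the rule stated above: the function name and the representation of each node's roles as a set of strings are hypothetical, and it does not query the HP XC database or call any HP XC utility.

```python
def check_availability_roles(head_roles, peer_roles):
    """Return a list of problems; an empty list means the rule is met.

    head_roles and peer_roles are sets of role names for the head node
    and for the other node in the availability set (hypothetical inputs).
    """
    problems = []
    if "management_server" not in head_roles:
        problems.append("head node lacks management_server")
    if "management_hub" in head_roles:
        problems.append("head node must not have management_hub")
    for role in ("management_server", "management_hub"):
        if role not in peer_roles:
            problems.append(f"other node lacks {role}")
    return problems


# A pairing that satisfies the rule yields no problems.
valid = check_availability_roles(
    {"management_server"},
    {"management_server", "management_hub"},
)
print(valid)
```

Returning a list of problems rather than a single true or false value makes it easier to see which of the two nodes needs a role change.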

You can find the complete documentation for Nagios at the Nagios website:

www.nagios.org

Specific information on Nagios features is available at the following web address:

www.nagios.org/about/

Additional information on Nagios is commercially available. The following web address lists

documents that describe Nagios and its use for system and network administration:

http://www.nagios.org/propaganda/books/

“Messages Reported by Nagios” (page 249) describes troubleshooting information reported by

Nagios.
