HP XC System Software Administration Guide Version 3.2

[n47] Power Unit Power Redundancy Redundancy Lost
7. A date and time stamp indicating when the event that caused the alert occurred.
8. How long the message waited in the nand queue, that is, how much time elapsed before
this message was mailed.
9. The nand sequence number. The nand daemon receives and batches messages generated
by Nagios and sends them by e-mail.
8.4.4 System Event Log Monitoring
This section explains the system event log and describes its configuration details.
8.4.4.1 Understanding the System Event Log
Each HP hardware platform supplies an event logging mechanism that captures platform-specific
events to track hardware state and changes. The contents of the system event log vary by
platform, but they typically include, among others, the following:
• Memory ECC errors
• Power supply failures
• Voltage problems
Event logs are stored by the firmware and can become full over time. Some platforms require
regular maintenance to clear the logs to avoid losing critical events. In addition, errors that
indicate failure or pending failure of a component need to be brought to the operator's immediate
attention.
The HP XC system event log functionality provides complete management of all log types of
supported HP platforms. Log information is regularly read, archived, and used to generate
Nagios alerts when applicable. Logs that approach a critical size are cleared to prevent loss of
event data.
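The read, archive, and clear cycle described above can be sketched as follows. This is only an illustration of the idea, not the HP XC implementation: the EventLog class, the manage_log function, and the 80 percent clearing threshold are all assumptions made for the example, because the guide does not state the actual critical size.

```python
from dataclasses import dataclass, field


@dataclass
class EventLog:
    """Hypothetical in-memory stand-in for a firmware event log."""
    capacity: int
    entries: list = field(default_factory=list)


def manage_log(log, archive, already_archived, clear_fraction=0.8):
    """Archive entries not seen before, then clear the log if it is nearly full.

    Returns how many entries of the (possibly cleared) log are already
    archived, so the caller can pass that value in on the next pass.
    The 0.8 threshold is an assumed value, not one taken from HP XC.
    """
    # Archive first, so that clearing can never lose an event.
    archive.extend(log.entries[already_archived:])
    archived = len(log.entries)
    if archived >= clear_fraction * log.capacity:
        log.entries.clear()
        archived = 0
    return archived
```

Archiving before clearing is the point of the design: a log is emptied only after every entry in it has been copied to the history file.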
Event logs are typically accessed through the management port. They require platform- and
protocol-specific user authentication as well as network access to the console port (cp-nxxx,
where nxxx is the node number). System event log history is captured in
/hptc_cluster/adm/logs/sel/sel-nxxx.log, where nxxx represents the name of the
individual node. Logs are managed by the standard logrotate functionality. For more
information on this facility, see logrotate(8).
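As a minimal sketch of how the history files named above might be inspected, the following Python helper builds the log path for a node and returns its most recent lines. The directory and the sel-nxxx.log naming come from the text; the function names, the default of 20 lines, and the example node name n47 are hypothetical.

```python
from collections import deque
from pathlib import Path

# Directory and file naming taken from the text above.
SEL_LOG_DIR = Path("/hptc_cluster/adm/logs/sel")


def sel_log_path(node_name, log_dir=SEL_LOG_DIR):
    """Return the event log history file for a node, for example sel-n47.log."""
    return Path(log_dir) / f"sel-{node_name}.log"


def tail_sel_log(node_name, count=20, log_dir=SEL_LOG_DIR):
    """Return the last `count` lines of a node's log, or [] if the file is absent."""
    path = sel_log_path(node_name, log_dir)
    if not path.is_file():
        return []
    with path.open(encoding="utf-8", errors="replace") as handle:
        # deque with maxlen keeps only the final `count` lines in memory.
        return [line.rstrip("\n") for line in deque(handle, maxlen=count)]
```

For example, tail_sel_log("n47", 5) would return up to the five most recent lines recorded for node n47, assuming that file exists on the system.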
8.4.4.2 System Event Log Configuration
HP XC systems gather system event log and hardware sensor information. Some
platforms require additional user name and password setup to allow access to the connection
on the console port. In addition, depending on how the head node console connection is attached
to the network, additional setup may be required.
8.4.4.3 Additional Optional Configuration
You can change the rotation of the system event logs and the rules for Nagios alerts.
You can reconfigure the rotation of the node event logs with the logrotate command.
Nagios creates alerts for power, memory, voltage, and ASR (Automatic System Recovery)
messages. The rules for alerts are defined in the /opt/hptc/nagios/etc/selRules file. You
can modify these rules by editing this file as follows:
• Add rules to this file for new alerts.
• Change alerts by modifying the corresponding rule in this file.
• Remove a rule to delete the corresponding alert.
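The effect of adding, changing, or removing a rule can be illustrated with a small Python sketch of rule-based matching. It does not reproduce the syntax of the selRules file, which this section does not describe; the RULES table, its patterns, and the classify_event function are hypothetical and only model the four alert categories named above.

```python
import re

# Hypothetical rules as (alert category, pattern) pairs.
# This is NOT the format of /opt/hptc/nagios/etc/selRules.
RULES = [
    ("power", re.compile(r"power (supply|unit)", re.IGNORECASE)),
    ("memory", re.compile(r"\becc\b|memory", re.IGNORECASE)),
    ("voltage", re.compile(r"voltage", re.IGNORECASE)),
    ("asr", re.compile(r"\basr\b|automatic system recovery", re.IGNORECASE)),
]


def classify_event(message, rules=RULES):
    """Return the category of the first rule that matches, or None."""
    for category, pattern in rules:
        if pattern.search(message):
            return category
    return None


# Adding a rule produces a new alert category.
with_fan = RULES + [("fan", re.compile(r"\bfan\b", re.IGNORECASE))]

# Removing a rule means matching events no longer raise an alert.
without_voltage = [rule for rule in RULES if rule[0] != "voltage"]
```

In this model, an event that matches no rule yields None and therefore raises no alert, which mirrors how deleting a rule deletes the corresponding alert.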