Managing HP Serviceguard for Linux, Eighth Edition, March 2008

Troubleshooting Your Cluster
Monitoring Hardware
Good practice in handling a high availability system includes careful
fault monitoring, so that you can prevent failures where possible or at
least react to them swiftly when they occur. For information about disk
monitoring, see “Creating a Disk Monitor Configuration” on page 239. In
addition, monitor the following for errors or warnings of all kinds:
• CPUs
• Memory
• LAN cards
• Power sources
• All cables
• Disk interface cards
Some monitoring can be done through simple physical inspection, but for
the most comprehensive monitoring, you should periodically examine the
system log file (/var/log/messages) for reports on all configured HA
devices. Errors relating to a device indicate that the device needs
maintenance.
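
The periodic log review described above can be partly automated. The
following is a minimal sketch in Python, not a Serviceguard tool: the
function names and the keyword patterns are assumptions chosen for
illustration, and you would need to adjust them to the messages that the
drivers on your own nodes actually write. It groups suspicious lines
from /var/log/messages by hardware category.

```python
import re

# Assumed patterns, not defined by Serviceguard; tune them for your hardware.
SEVERITY = re.compile(r"\b(?:error|warn|fail|fault)", re.IGNORECASE)
DEVICE_PATTERNS = {
    "cpu": re.compile(r"\bcpu\d*\b|machine check", re.IGNORECASE),
    "memory": re.compile(r"\bmemory\b|\becc\b|\bedac\b", re.IGNORECASE),
    "lan": re.compile(r"\beth\d+\b|\bbond\d+\b|\blink\b", re.IGNORECASE),
    "power": re.compile(r"\bpower\b|\bpsu\b", re.IGNORECASE),
    "disk": re.compile(r"\bscsi\b|\bsd[a-z]{1,2}\d*\b|\bhba\b", re.IGNORECASE),
}


def scan_log(lines):
    """Return {category: [matching lines]} for lines that look like errors or warnings."""
    hits = {}
    for line in lines:
        if not SEVERITY.search(line):
            continue
        # A suspicious line that names no known device is reported under "other".
        categories = [name for name, pat in DEVICE_PATTERNS.items() if pat.search(line)]
        for name in categories or ["other"]:
            hits.setdefault(name, []).append(line.rstrip("\n"))
    return hits


def main(path="/var/log/messages"):
    """Print a short per-category summary of suspicious lines in the system log."""
    try:
        with open(path, errors="replace") as handle:
            hits = scan_log(handle)
    except OSError as exc:
        print(f"cannot read {path}: {exc}")
        return
    for category, matched in sorted(hits.items()):
        print(f"{category}: {len(matched)} suspicious line(s)")
        for entry in matched[-3:]:  # show only the most recent few
            print(f"  {entry}")


if __name__ == "__main__":
    main()
```

Reading /var/log/messages normally requires root privileges. A script
like this could be run from cron on each node, but keyword matching is
only a heuristic, so it supplements a manual review of the log rather
than replacing it.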