Managing HP Serviceguard for Linux, Sixth Edition, August 2006

Chapter 8: Troubleshooting Your Cluster
Monitoring Hardware
Good practice in managing a high availability system includes careful
fault monitoring, so that you can prevent failures where possible or at
least react to them swiftly when they occur. Disks can be monitored using
the Disk Monitor daemon, which is described in Chapter 5. In addition, the
following should be monitored for errors or warnings of all kinds:
• CPUs
• Memory
• LAN cards
• Power sources
• All cables
• Disk interface cards
Some monitoring can be done through simple physical inspection, but for
the most comprehensive monitoring, you should examine the system log
file (/var/log/messages) periodically for reports on all configured HA
devices. Errors reported for a device indicate that the device needs
maintenance.
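
The periodic log check described above could be partly automated. The
following is a minimal sketch, not part of Serviceguard: it scans log lines
for severity keywords and reports those that mention a device of interest.
The keyword pattern, the function names, and the device names (eth0, sda)
are illustrative assumptions; only the log path /var/log/messages comes from
the text above.

```python
import re
from collections import Counter

# Assumed keyword list: adjust to the messages your drivers actually log.
SEVERITY_PATTERN = re.compile(r"\b(error|warning|fail(?:ed|ure)?)\b", re.IGNORECASE)


def find_suspect_lines(lines, devices):
    """Return (device, line) pairs for lines that contain a severity
    keyword and mention one of the given device names."""
    hits = []
    for line in lines:
        if not SEVERITY_PATTERN.search(line):
            continue
        lowered = line.lower()
        for device in devices:
            if device.lower() in lowered:
                hits.append((device, line.rstrip("\n")))
    return hits


def summarize(hits):
    """Count suspect lines per device."""
    return Counter(device for device, _ in hits)


if __name__ == "__main__":
    # Hypothetical device names; substitute the HA devices on your nodes.
    devices = ["eth0", "sda"]
    try:
        with open("/var/log/messages", errors="replace") as log:
            hits = find_suspect_lines(log, devices)
    except OSError as exc:
        # The log may be absent or readable only by root on some systems.
        print(f"Could not read system log: {exc}")
    else:
        for device, count in summarize(hits).items():
            print(f"{device}: {count} suspect line(s)")
```

A script of this kind would typically be run at regular intervals (for
example from cron) on each node. Simple substring matching can miss or
over-report messages, so treat its output as a prompt for closer inspection
rather than a replacement for reading the log.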