Managing HP Serviceguard for Linux, Sixth Edition, August 2006

Troubleshooting Your Cluster

Monitoring Hardware

Good practice in managing a high availability system includes careful
fault monitoring, so that failures are prevented where possible or at
least handled swiftly when they occur. Disks can be monitored using the
Disk Monitor daemon, which is described in Chapter 5. In addition, the
following should be monitored for errors and warnings of all kinds:

• CPUs
• Memory
• LAN cards
• Power sources
• All cables
• Disk interface cards

Some monitoring can be done through simple physical inspection, but for
the most comprehensive monitoring, you should periodically examine the
system log file (/var/log/messages) for reports on all configured HA
devices. Errors relating to a device indicate that it needs maintenance.
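
The periodic log review described above can be partly automated. The following Python sketch is not part of Serviceguard; it is a minimal illustration that filters log lines for severity keywords and sorts them into rough hardware categories. The keyword patterns, category names, and the functions scan_log, scan_file, and summarize are all assumptions chosen for illustration, and would need to be adapted to the messages your drivers and devices actually write.

```python
import re
from collections import Counter

# Assumed severity keywords; real messages vary by driver and distribution.
SEVERITY = re.compile(r"\b(error|warning|fail(?:ed|ure)?|timeout)\b", re.IGNORECASE)

# Hypothetical device hints, checked in order; the first match wins.
DEVICE_HINTS = {
    "cpu": re.compile(r"\b(cpu\d*|mce|machine check)\b", re.IGNORECASE),
    "memory": re.compile(r"\b(memory|ecc|edac)\b", re.IGNORECASE),
    "lan": re.compile(r"\b(eth\d+|bond\d+|link)\b", re.IGNORECASE),
    "disk": re.compile(r"\b(scsi|sd[a-z]+\d*|i/o error)\b", re.IGNORECASE),
    "power": re.compile(r"\b(power|ups|psu)\b", re.IGNORECASE),
}


def scan_log(lines):
    """Return (category, line) pairs for lines that look like hardware problems."""
    hits = []
    for line in lines:
        if not SEVERITY.search(line):
            continue  # no error or warning keyword on this line
        for category, pattern in DEVICE_HINTS.items():
            if pattern.search(line):
                hits.append((category, line.rstrip("\n")))
                break
    return hits


def scan_file(path="/var/log/messages"):
    """Scan a log file; reading it usually requires root privileges."""
    with open(path, errors="replace") as log:
        return scan_log(log)


def summarize(hits):
    """Count matching lines per hardware category."""
    return Counter(category for category, _ in hits)
```

Calling summarize(scan_file()) on a node returns a count of suspect lines per category, which can be compared between runs. A match is only a prompt for closer inspection of the device, and a clean result does not replace physical inspection.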