Managing HP Serviceguard for Linux Ninth Edition, April 2009

is to test your package control script before putting your high availability application
on line.
Adding a set -x statement as the second line of your control script makes the shell
print each command as it executes, which shows where the script may be failing.
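As a minimal sketch of this technique, the script below turns tracing on in its second line. The start_app function is a hypothetical stand-in for a step in a real package control script; it is not part of Serviceguard.

```shell
#!/bin/sh
set -x   # second line of the script: trace each command to stderr as it runs

# Hypothetical stand-in for one step of a package control script.
start_app() {
    echo "starting application"
    return 0
}

start_app
status=$?   # capture the exit status of the step under test

set +x   # turn tracing off again once the section of interest has run
echo "start_app exited with status $status"
```

Each traced command is written to standard error with a leading + sign, so the last traced line before a failure shows the command that failed.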
Node and Network Failures
These failures cause Serviceguard to transfer control of a package to another node. This
is normal Serviceguard behavior, but you must be able to recognize when a transfer has
taken place and decide whether to leave the cluster in its current condition or
restore it to its original condition.
Node failures can be caused by the following conditions:
Reboot
Kernel Oops
Hangs
Power failures
You can use the following commands to check the status of your network and subnets:
ifconfig - to display LAN status and check to see if the package IP is stacked
on the LAN card.
arp -a - to check the arp tables.
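The two checks above can be combined into a small helper. This is only a sketch: the has_address and check_package_ip functions are hypothetical, and the address in the usage comment is a placeholder for your package's relocatable IP address.

```shell
#!/bin/sh
# Succeeds if the address in $1 appears as a whole word in the text on stdin.
# -F treats the dots literally; -w keeps 192.6.7.4 from matching 192.6.7.40.
has_address() {
    grep -qFw -- "$1"
}

# Report whether address $1 shows up in the ifconfig and arp -a output.
check_package_ip() {
    if ifconfig | has_address "$1"; then
        echo "$1 is configured on a LAN interface of this node"
    else
        echo "$1 is not configured on this node"
    fi
    if arp -a | has_address "$1"; then
        echo "$1 has an entry in the arp table"
    else
        echo "$1 has no entry in the arp table"
    fi
}

# Usage (placeholder address; substitute your package IP):
#   check_package_ip 192.10.25.12
```

A node does not normally list its own addresses in its arp table, so the arp check is most informative when run from a different node than the one currently running the package.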
Because every cluster is unique, there are no cookbook solutions for every possible
problem. But if you apply these checks and commands and work your way through the log
files, you should be able to identify and solve most problems.
Troubleshooting the Quorum Server
NOTE: See the HP Serviceguard Quorum Server Version A.04.00 Release Notes for
information about configuring the Quorum Server. Do not proceed without reading
the Release Notes for your version.
Authorization File Problems
A message like the following in a Serviceguard node’s syslog file, or in the output
of cmviewcl -v, may indicate an authorization problem:
Access denied to quorum server 192.6.7.4
The reason may be that you have not updated the authorization file. Verify that the
node is included in the file, and try using /usr/lbin/qs -update to re-read the
quorum server authorization file.
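The checks described above might be scripted as follows. This is a sketch under stated assumptions: the function names are hypothetical, the authorization file is assumed to list one host name per line, and the file paths in the usage comments are placeholders, since the locations of the syslog file and the authorization file depend on your installation. Only the /usr/lbin/qs -update command is taken from the text above.

```shell
#!/bin/sh
# Succeeds if the "Access denied" message appears in log file $1.
access_denied_logged() {
    grep -q "Access denied to quorum server" "$1"
}

# Succeeds if host name $1 appears on a line of its own in file $2.
# Assumes one host name per line (-x: whole line, -F: literal string).
node_in_authfile() {
    grep -qxF -- "$1" "$2"
}

# Usage (all paths and host names below are placeholders):
#   access_denied_logged /var/log/messages && echo "authorization problem suspected"
#   node_in_authfile node1.example.com /path/to/qs_authfile || echo "node not listed"
#   /usr/lbin/qs -update    # re-read the authorization file after editing it
```

If the node is missing from the file, add it first; re-reading the file only helps once the file itself is correct.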