Managing HP Serviceguard for Linux, Tenth Edition, September 2012

These errors are caused specifically by mistakes in the cluster configuration file and package
configuration scripts. Examples include:
Volume groups not defined on adoptive node.
Mount point does not exist on adoptive node.
Network errors on adoptive node (configuration errors).
User information not correct on adoptive node.
You can use the following commands to check the status of your disks:
df - to see if your package’s file systems are mounted.
vgdisplay -v - to see if all volumes are present.
strings /etc/lvmconf/*.conf - to ensure that the configuration is correct.
fdisk -l /dev/sdx - to display the partition table of a disk.
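As a rough illustration of the mount check, the following sketch tests whether a mount point appears in the kernel mount table. The helper name is_mounted and the mount point /mnt/pkg1 are hypothetical, and /proc/mounts is assumed to be readable, as it is on most Linux systems.

```shell
#!/bin/sh
# Sketch only: is_mounted and /mnt/pkg1 are illustrative names, not part of
# Serviceguard.

# Succeeds if mount point $1 is listed in the mount table file $2
# (defaults to /proc/mounts, an assumption about the local system).
is_mounted() {
    table="${2:-/proc/mounts}"
    # Field 2 of each mount table line is the mount point.
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "$table"
}

# Hypothetical package mount point; substitute your own.
PKG_MOUNT="/mnt/pkg1"

if is_mounted "$PKG_MOUNT"; then
    echo "$PKG_MOUNT is mounted"
else
    echo "$PKG_MOUNT is NOT mounted"
fi
```

Run the check on the adoptive node as well as the primary node, since a mount point that is missing on one node is one of the configuration errors listed above.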
Package Control Script Hangs or Failures
When a RUN_SCRIPT_TIMEOUT or HALT_SCRIPT_TIMEOUT value is set and the
control script hangs long enough to exceed the timeout, Serviceguard kills the script
and marks the package “Halted.” Similarly, when a package control script fails,
Serviceguard kills the script and marks the package “Halted.” In both cases, the following
also take place:
Control of the package will not be transferred.
The run or halt instructions may not run to completion.
Global switching will be disabled.
The current node will be disabled from running the package.
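The general idea of a script being killed when it exceeds a time limit can be sketched with the GNU coreutils timeout command. This is a generic illustration only; it does not show how Serviceguard itself enforces RUN_SCRIPT_TIMEOUT or HALT_SCRIPT_TIMEOUT. The function name run_with_limit is hypothetical, and sleep stands in for a hung control script.

```shell
#!/bin/sh
# Generic illustration; not Serviceguard's internal mechanism.
# Assumes the GNU coreutils `timeout` command is installed.

# Run a command with a time limit in seconds and report the outcome.
run_with_limit() {
    limit="$1"; shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # timeout exits with 124 when it had to terminate the command.
        echo "timed out after ${limit}s; command was terminated"
    elif [ "$status" -ne 0 ]; then
        echo "failed with exit status $status"
    else
        echo "completed"
    fi
    return "$status"
}

# Stand-in for a hung control script: sleeps longer than the limit.
run_with_limit 1 sleep 5 || true
```

In both the timeout and the failure case the command does not run to completion, which is why resources it had already activated can be left behind.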
Following such a failure, since the control script is terminated, some of the package’s
resources may be left activated. Specifically:
Volume groups may be left active.
File systems may still be mounted.
IP addresses may still be installed.
Services may still be running.
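The four kinds of leftover resources listed above can be checked with a short script such as the following sketch. The volume group, mount point, IP address, and process name are hypothetical placeholders, and has_ip is an illustrative helper. The script assumes the standard Linux lvscan, ip, and pgrep commands and skips any check whose command is not installed.

```shell
#!/bin/sh
# Sketch only: all package resource names below are hypothetical.
PKG_VG="vgpkg1"
PKG_MOUNT="/mnt/pkg1"
PKG_IP="192.168.1.10"
PKG_PROC="pkg1_service"

# Succeeds if address $1 appears in `ip -o -4 addr show` output on stdin.
# In that output, field 4 is the address with its prefix length.
has_ip() {
    awk -v ip="$1" '{ split($4, a, "/"); if (a[1] == ip) found = 1 }
                    END { exit !found }'
}

# Volume group still active? (lvscan usually requires root.)
if command -v lvscan >/dev/null 2>&1; then
    if lvscan 2>/dev/null | grep "ACTIVE" | grep -q "/dev/$PKG_VG/"; then
        echo "volume group $PKG_VG has active logical volumes"
    fi
fi

# File system still mounted?
if grep -q " $PKG_MOUNT " /proc/mounts 2>/dev/null; then
    echo "file system still mounted at $PKG_MOUNT"
fi

# Package IP address still configured?
if command -v ip >/dev/null 2>&1 && ip -o -4 addr show | has_ip "$PKG_IP"; then
    echo "IP address $PKG_IP is still configured"
fi

# Service process still running? (-x matches the process name exactly.)
if command -v pgrep >/dev/null 2>&1 && pgrep -x "$PKG_PROC" >/dev/null; then
    echo "process $PKG_PROC is still running"
fi
```

The script only reports what it finds. Deactivating, unmounting, or stopping anything is left to the administrator, since the correct order depends on the package.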
In this kind of situation, Serviceguard will not restart the package without manual
intervention. You must clean up manually before restarting the package. Use the following
steps as guidelines:
1. Perform application-specific cleanup. Any application-specific actions the control
script might have taken should be undone to ensure successfully starting the package