Managing HP Serviceguard A.11.20.20 for Linux, March 2014



specified in the package control script appear in the ifconfig output under the inet addr:

in the ethX:Y block, use cmmodnet to remove them:

cmmodnet -r -i <ip-address> <subnet>

where <ip-address> is the address indicated above and <subnet> is the result of masking

the <ip-address> with the mask found in the same line as the inet address in the

ifconfig output.

3. Ensure that package volume groups are deactivated. First unmount any package logical volumes that are being used for file systems. To find them, inspect the output of the command df -l. If any package logical volumes, as specified by the LV[] array variables in the package control script, appear under the “Filesystem” column, use fuser to kill any processes still accessing them, then use umount to unmount them:

fuser -ku <logical-volume>

umount <logical-volume>

Next, deactivate the package volume groups, which are specified by the VG[] array entries in the package control script:

vgchange -a n <volume-group>

4. Finally, re-enable the package for switching.

cmmodpkg -e <package-name>

If, after cleaning up the node on which the timeout occurred, you want that node to serve as an alternate for running the package, remember to re-enable the package to run on that node:

cmmodpkg -e -n <node-name> <package-name>
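The <subnet> argument used with cmmodnet in the cleanup steps above is the bitwise AND of the address and its mask. The following is a minimal sketch of that calculation in POSIX shell. The helper name subnet_of and the sample address and mask are illustrative assumptions, not part of Serviceguard; the sketch only prints the cmmodnet command so that the result can be checked before anything is run.

```shell
#!/bin/sh
# subnet_of is a hypothetical helper, not a Serviceguard command.
# It ANDs each octet of an IPv4 address with the matching octet of the mask.
# The body runs in a subshell so the IFS change does not leak to the caller.
subnet_of() (
    IFS=.
    # Split both dotted quads: $1-$4 become the address, $5-$8 the mask.
    set -- $1 $2
    echo "$(($1 & $5)).$(($2 & $6)).$(($3 & $7)).$(($4 & $8))"
)

# Example values only; take the real address and mask from the ifconfig output.
ip=192.168.1.77
mask=255.255.252.0

# Print (do not execute) the resulting cleanup command.
echo "cmmodnet -r -i $ip $(subnet_of "$ip" "$mask")"
# prints: cmmodnet -r -i 192.168.1.77 192.168.0.0
```

Note that the mask in this example is not a whole-octet mask, so the third octet of the subnet (0) differs from the third octet of the address (1); copying the first three octets of the address by hand would give the wrong subnet here.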

The default Serviceguard control scripts are designed to take the straightforward steps needed to get an application running or stopped. If the package administrator specifies a time limit within which these steps need to occur and that limit is subsequently exceeded for any reason, Serviceguard takes the conservative approach that the control script logic must be either hung or defective in some way. At that point the control script cannot be trusted to perform cleanup actions correctly, so the script is terminated and the package administrator is given the opportunity to assess what cleanup steps must be taken.

If you want the package to switch automatically in the event of a control script timeout, set the node_fail_fast_enabled parameter (page 176) to YES. Serviceguard will then reboot the node where the control script timed out, which effectively cleans up any side effects of the package’s run or halt attempt. The package will then be automatically restarted on any available alternate node for which it is configured.

8.8.6 Package Movement Errors (Legacy Packages)

These errors are similar to the system administration errors, except that they are caused specifically by errors in the package control script. The best way to prevent these errors is to test your package control script before putting your high availability application online.

Adding a set -x statement in the second line of your control script will give you details on where

your script may be failing.
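As a minimal illustration of what set -x produces, the sketch below uses a stand-in script rather than a real package control script; the function name start_step and its message are hypothetical. With tracing enabled, the shell writes each command to standard error, prefixed with +, before executing it, so the last traced line before a failure shows where the script stopped.

```shell
#!/bin/sh
set -x   # second line of the script: trace every command from here on

# Stand-in for one step of a control script (hypothetical, for illustration).
start_step() {
    echo "starting $1"
}

start_step demo_service

set +x   # turn tracing off again
```

Because the trace goes to standard error, redirect it along with standard output if you want to keep it in a file for later inspection.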

Package startup failure due to uncleaned LVM2 hosttags

When the LVM2 hosttags feature is used in a volume group, Serviceguard ensures that the hosttags are cleaned up on every package halt. However, in the case of a node power failure or a crash initiated by the SERVICE_FAIL_FAST / NODE_FAIL_FAST feature, the hosttags will not be cleaned up. In such cases, the hosttags must be cleaned up manually before starting the package on another node. The following messages can be seen in the package log when the package fails to start up on another node; the log also provides the procedure to clean up the hosttags.

Feb 11 17:18:36 root@abc.hp.com volume_group.sh[1871]: ERROR: Function activation_check:

Feb 11 17:18:36 root@abc.hp.com volume_group.sh[1871]: Error vg01 may still be activated on xyz.hp.com
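To find out quickly which volume groups and nodes are affected, the "may still be activated on" messages can be filtered out of a package log. The following is a rough sketch, assuming the messages have the same wording as the sample lines above; the function name stale_activations is hypothetical and is not a Serviceguard tool.

```shell
#!/bin/sh
# stale_activations is a hypothetical helper, not part of Serviceguard.
# It reads package log text (from the named files, or from standard input)
# and prints one "<volume-group> <node>" pair per matching message.
stale_activations() {
    sed -n 's/.*Error \([^ ]*\) may still be activated on \([^ ]*\).*/\1 \2/p' "$@"
}

# Demonstration using the two sample log lines shown above.
printf '%s\n' \
    'Feb 11 17:18:36 root@abc.hp.com volume_group.sh[1871]: ERROR: Function activation_check:' \
    'Feb 11 17:18:36 root@abc.hp.com volume_group.sh[1871]: Error vg01 may still be activated on xyz.hp.com' |
    stale_activations
# prints: vg01 xyz.hp.com
```

The output only identifies where to look; the cleanup itself should follow the procedure given in the package log.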

264 Troubleshooting Your Cluster