Managing HP Serviceguard for Linux Ninth Edition, April 2009

Service Restarts
You can allow a service to restart locally following a failure. To do this, you indicate a
number of restarts for each service in the package control script. When a service starts,
the variable service_restart is set in the service’s environment. The service, as it executes,
can examine this variable to see whether it has been restarted after a failure, and if so,
it can take appropriate action such as cleanup.
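As an illustration of how a service might examine this variable at startup, here is a minimal Python sketch. The manual states only that service_restart is set in the service's environment; the sketch assumes, without confirmation from the text, that the value is an integer restart count and that a missing, empty, or non-numeric value means a first start. The function names and the cleanup message are hypothetical.

```python
import os


def restart_count(environ=None):
    """Return the restart count read from service_restart, or 0 if unusable.

    Assumption: the variable holds an integer count of restarts. The manual
    only says that the variable is set in the service's environment.
    """
    env = os.environ if environ is None else environ
    raw = env.get("service_restart", "")
    try:
        return max(int(raw), 0)
    except ValueError:
        # Empty or non-numeric value: treat as a first start.
        return 0


def start_service(environ=None):
    """Decide whether cleanup is needed before the service does its work."""
    count = restart_count(environ)
    if count > 0:
        # Hypothetical cleanup step, e.g. removing stale lock or PID files.
        return f"restart {count}: cleaning up stale state before starting"
    return "first start: no cleanup needed"


if __name__ == "__main__":
    print(start_service())
```

Passing the environment as a parameter keeps the check easy to exercise outside a running package; in a real service the default, os.environ, would be used.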

Network Communication Failure
An important element in the cluster is the health of the network itself. As it continuously
monitors the cluster, each node listens for heartbeat messages from the other nodes
confirming that all nodes are able to communicate with each other. If a node does not
hear these messages within the configured amount of time, a node timeout occurs,
resulting in a cluster re-formation and later, if there are still no heartbeat messages
received, a reboot. See “What Happens when a Node Times Out” (page 88).
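The timeout check described above can be pictured with a small Python sketch. This is not Serviceguard's implementation; it only illustrates the idea of recording when each peer was last heard from and flagging peers that have been silent longer than a configured interval. The class name, method names, node names, and timeout value are all hypothetical.

```python
import time


class HeartbeatMonitor:
    """Track the last heartbeat time per peer and report silent peers."""

    def __init__(self, peers, timeout_seconds, now=None):
        self.timeout_seconds = timeout_seconds
        start = time.monotonic() if now is None else now
        # Every peer is treated as heard from at the moment monitoring starts.
        self.last_heard = {peer: start for peer in peers}

    def record_heartbeat(self, peer, now=None):
        """Note that a heartbeat message arrived from the given peer."""
        self.last_heard[peer] = time.monotonic() if now is None else now

    def timed_out_peers(self, now=None):
        """Return peers whose last heartbeat is older than the timeout."""
        current = time.monotonic() if now is None else now
        return sorted(
            peer
            for peer, heard in self.last_heard.items()
            if current - heard > self.timeout_seconds
        )


if __name__ == "__main__":
    monitor = HeartbeatMonitor(["node2", "node3"], timeout_seconds=2.0, now=0.0)
    monitor.record_heartbeat("node2", now=1.5)
    # node3 has been silent for 3.0 seconds, node2 for only 1.5 seconds.
    print(monitor.timed_out_peers(now=3.0))
```

In the sketch, a non-empty result from timed_out_peers corresponds to the node timeout that, in the cluster, triggers re-formation. The explicit now arguments exist only to make the example deterministic; omitting them falls back to a monotonic clock.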