Managing HP Serviceguard for Linux Ninth Edition, April 2009

There are two principles to keep in mind for automating application relocation:
• Insulate users from outages.
• Applications must have defined startup and shutdown procedures.
You need to know what currently happens when the system your application is
running on is rebooted, and whether the application's response must change to
support high availability.
Insulate Users from Outages
Wherever possible, insulate your end users from outages. Issues include the following:
• Do not require user intervention to reconnect when a connection is lost due to a
failed server.
• Where possible, warn users of slight delays due to a failover in progress.
• Minimize the reentry of data.
• Engineer the system for reserve capacity to minimize the performance degradation
experienced by users.
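The first two points above can be sketched as a client-side retry wrapper. This is only an illustration under stated assumptions: the function name retry_connect and its arguments are hypothetical, and the command passed to it stands in for whatever your client uses to reach the application.

```shell
#!/bin/sh
# Hypothetical helper, not part of Serviceguard: retry a connection command
# so that the user does not have to reconnect by hand after a failover.
# Usage: retry_connect MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry_connect() {
    max=$1
    delay=$2
    shift 2
    attempt=1
    while [ "$attempt" -le "$max" ]; do
        # Run the caller's command; stop retrying as soon as it succeeds.
        "$@" && return 0
        if [ "$attempt" -lt "$max" ]; then
            # Warn the user about the delay instead of failing silently.
            echo "connection attempt $attempt failed; a failover may be in progress, retrying in ${delay}s" >&2
            sleep "$delay"
        fi
        attempt=$((attempt + 1))
    done
    echo "giving up after $max attempts" >&2
    return 1
}
```

A caller might wrap its own connect command, for example retry_connect 10 5 followed by that command. Suitable values for the attempt count and delay depend on how long a failover takes in your cluster.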
Define Application Startup and Shutdown
Applications must be restartable without manual intervention. If the application requires
a switch to be flipped on a piece of hardware, then automated restart is impossible.
Procedures for application startup, shutdown and monitoring must be created so that
the HA software can perform these functions automatically.
To ensure automated response, there should be defined procedures for starting up the
application and stopping the application. In Serviceguard these procedures are placed
in the package control script. These procedures must check for errors and return status
to the HA control software. The startup and shutdown should be command-line driven
and not interactive unless all of the answers can be predetermined and scripted.
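A minimal sketch of such procedures follows. It shows only the general shape: non-interactive start and stop functions that check for errors and return a status. The names APP_START_CMD, APP_STOP_CMD, start_app, stop_app and run_action are assumptions made for this example; they are not Serviceguard variables or functions, and the way a real package control script invokes them is not shown here.

```shell
#!/bin/sh
# Placeholders (assumptions): replace with the application's real,
# command-line driven start and stop commands. "true" is a stand-in.
APP_START_CMD=${APP_START_CMD:-true}
APP_STOP_CMD=${APP_STOP_CMD:-true}

# Start the application with stdin closed so it cannot prompt for input.
start_app() {
    if $APP_START_CMD </dev/null; then
        echo "application started"
        return 0
    fi
    echo "ERROR: application failed to start" >&2
    return 1
}

# Stop the application and report whether the stop command succeeded.
stop_app() {
    if $APP_STOP_CMD </dev/null; then
        echo "application stopped"
        return 0
    fi
    echo "ERROR: application failed to stop" >&2
    return 1
}

# Dispatch on the requested action and pass the status back to the caller.
run_action() {
    case "$1" in
        start) start_app ;;
        stop)  stop_app ;;
        *)     echo "usage: run_action start|stop" >&2; return 2 ;;
    esac
}
```

The caller can then act on the return status of run_action start or run_action stop, which is the information the HA control software needs.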
In an HA failover environment, HA software restarts the application on a surviving
system in the cluster that has the necessary resources, such as access to the necessary
disk drives. The application must be restartable in two respects:
• It must be able to restart and recover on the backup system (or on the same system
if the application restart option is chosen).
• It must be able to restart if it fails during the startup and the cause of the failure
is resolved.
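One common obstacle to the second requirement is leftover state from a failed start, such as a stale PID file. The sketch below assumes an application that records its process ID in a file; PID_FILE and clear_stale_pid are hypothetical names chosen for this example. It also assumes the script runs as root or as the application's own user, because kill -0 fails for processes owned by other users.

```shell
#!/bin/sh
# Hypothetical path (assumption): where the application writes its PID.
PID_FILE=${PID_FILE:-/tmp/myapp.pid}

# Remove the PID file if the process it names no longer exists, so that a
# later start attempt is not blocked by debris from an earlier failure.
# Returns 1 (and leaves the file alone) if the process is still alive.
clear_stale_pid() {
    [ -f "$PID_FILE" ] || return 0
    pid=$(cat "$PID_FILE")
    # kill -0 sends no signal; it only tests whether the process exists.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "application already running as PID $pid" >&2
        return 1
    fi
    rm -f "$PID_FILE"
    return 0
}
```

A startup procedure could call clear_stale_pid before launching the application and abort the start if it returns a nonzero status.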
Application administrators need to learn to start up and shut down applications using
the appropriate HA commands. Inadvertently shutting down the application directly
will initiate an unwanted failover. Application administrators also need to be careful
that they don't accidentally shut down a production instance of an application rather
than a test instance in a development environment.
A mechanism to monitor whether the application is active is necessary so that the HA
software knows when the application has failed. This may be as simple as a script that
issues the command ps -ef | grep xxx for all the processes belonging to the
application.
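A monitoring script of this kind might look like the following sketch. The function names app_is_running and monitor_app are hypothetical, and the pattern argument plays the role of xxx above. The sketch assumes that the HA software is configured to treat the monitor's nonzero return as an application failure; that configuration is not shown.

```shell
#!/bin/sh
# Succeed if at least one process line from ps -ef contains the pattern.
# The second grep drops the grep command itself from the results.
# Usage: app_is_running PATTERN
app_is_running() {
    ps -ef | grep -- "$1" | grep -v grep >/dev/null
}

# Poll until no matching process remains, then return 1 to signal failure.
# Usage: monitor_app PATTERN [INTERVAL_SECONDS]   (interval defaults to 5)
monitor_app() {
    while app_is_running "$1"; do
        sleep "${2:-5}"
    done
    echo "no process matching '$1' found" >&2
    return 1
}
```

Choose a pattern specific enough that it cannot match unrelated processes, and make sure it does not appear in the monitor script's own command line. An application with several required processes would need one check per process.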
290 Designing Highly Available Cluster Applications