Managing HP Serviceguard for Linux, Eighth Edition, March 2008

Designing Highly Available Cluster Applications
Appendix B
Automating Application Operation
Can the application be started and stopped automatically or does it
require operator intervention?
This section describes how to automate application operations to avoid
the need for user intervention. One of the first rules of high availability
is to avoid manual intervention. If it takes a user at a terminal, console,
or GUI to enter commands to bring up a subsystem, the user
becomes a key part of the system. It may take hours before a user can get
to a system console to do the necessary work: the hardware in question
may be located in a remote area where no trained users are available, the
systems may be located in a secure datacenter, or, during off hours, someone
may have to connect via modem.
There are two principles to keep in mind for automating application
relocation:
• Insulate users from outages.
• Applications must have defined startup and shutdown procedures.
You need to know what currently happens when the system your
application is running on is rebooted, and whether the application's
response needs to change to support high availability.
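As an illustration of the second principle, the following is a minimal Python sketch of a control wrapper that gives an application one defined, unattended start procedure and one stop procedure. The command paths, the COMMANDS table, and the control function are hypothetical names chosen for this example; they are not part of Serviceguard or of any particular application.

```python
import subprocess
import sys

# Hypothetical commands: substitute the application's real start and stop commands.
COMMANDS = {
    "start": ["/opt/myapp/bin/myapp", "--daemon"],
    "stop": ["/opt/myapp/bin/myapp", "--shutdown"],
}


def control(action, run=subprocess.call):
    """Run the start or stop procedure without prompting; return an exit code."""
    if action not in COMMANDS:
        print("usage: start|stop", file=sys.stderr)
        return 2
    # stdin is redirected from the null device, so a command that tries to
    # prompt gets end-of-file instead of waiting for an operator.
    return run(COMMANDS[action], stdin=subprocess.DEVNULL)
```

A caller would invoke control("start") or control("stop") and check the returned exit code, where zero conventionally means success. The run parameter exists only so the wrapper can be exercised without launching a real process.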
Insulate Users from Outages
Wherever possible, insulate your end users from outages. Issues include
the following:
• Do not require user intervention to reconnect when a connection is
  lost due to a failed server.
• Where possible, warn users of slight delays due to a failover in
  progress.
• Minimize the reentry of data.
• Engineer the system for reserve capacity to minimize the
  performance degradation experienced by users.
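The first two items in the list above can be sketched on the client side as a retry loop. This is a minimal Python illustration, not a prescribed implementation: connect and request are hypothetical callables supplied by the client program, and the retry count and delays are arbitrary assumptions that would need tuning to the actual failover time.

```python
import time


def call_with_reconnect(connect, request, attempts=5, base_delay=0.5,
                        notify=print, sleep=time.sleep):
    """Run request(conn), reconnecting automatically if the server goes away.

    connect: hypothetical zero-argument callable returning a connection.
    request: hypothetical callable taking the connection; it raises
             ConnectionError when the server is lost.
    """
    delay = base_delay
    for attempt in range(1, attempts + 1):
        try:
            conn = connect()
            return request(conn)
        except ConnectionError:
            if attempt == attempts:
                raise  # give up only after every attempt has failed
            # Warn the user of the delay instead of asking them to reconnect.
            notify(f"Server unavailable, retrying in {delay:.1f}s "
                   f"(attempt {attempt} of {attempts})")
            sleep(delay)
            delay *= 2  # wait longer each time while the failover completes
```

Because the request is passed in as a callable, the data the user already entered is resubmitted by the loop itself, which also addresses the point about minimizing reentry of data. The sketch assumes the request is safe to repeat.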