Managing HP Serviceguard for Linux, Seventh Edition, July 2007

Appendix B: Designing Highly Available Cluster Applications
This appendix describes how to create or port applications for high
availability, with emphasis on the following topics:
• Automating Application Operation
• Controlling the Speed of Application Failover
• Designing Applications to Run on Multiple Systems
• Restoring Client Connections
• Handling Application Failures
• Minimizing Planned Downtime
Designing for high availability means reducing both the unplanned and
the planned downtime that users experience. Unplanned downtime results
from unscheduled events such as power outages, system failures,
network failures, disk crashes, or application failures. Planned downtime
results from scheduled events such as backups, system upgrades
to new OS revisions, or hardware replacements.
Two key strategies should be kept in mind:
1. Design the application to handle a system reboot or panic. If you are
modifying an existing application for a highly available environment,
determine how the application currently behaves after a system panic.
In a highly available environment there should be defined (and
scripted) procedures for restarting the application. Procedures for
starting and stopping the application should be automatic, with no
user intervention required.
2. Do not let the application use system-specific information if doing
so would prevent it from failing over to another system and running
properly. In particular:
• The application should not refer to uname() or gethostname().
• The application should not refer to the SPU ID.
• The application should not refer to the MAC (link-level) address.
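
The two strategies above can be illustrated with a minimal Python sketch. It is not part of Serviceguard and is only one possible approach. The environment variable APP_SERVICE_NAME, the PID-file convention, and the function names are all hypothetical, chosen for illustration. The first function takes the application's identity from a package-level setting instead of the node's hostname; the second clears a stale PID file left behind by a panic so that a restart needs no operator.

```python
import os
from pathlib import Path


def service_identity(env=None):
    """Return the name the application should use for itself.

    The name comes from a package-level setting (APP_SERVICE_NAME, a
    hypothetical variable) rather than from gethostname() or uname(),
    so it is the same on whichever node runs the application.
    """
    env = os.environ if env is None else env
    name = env.get("APP_SERVICE_NAME")
    if not name:
        # Fail loudly instead of silently falling back to the hostname.
        raise RuntimeError("APP_SERVICE_NAME is not set")
    return name


def pid_is_running(pid):
    """Return True if a process with this PID exists on the local node."""
    try:
        os.kill(pid, 0)  # signal 0 only checks for existence
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def prepare_start(pid_file):
    """Clear state left by a crash so that a restart needs no operator.

    Returns "clean" if no PID file exists, "already-running" if the
    recorded process is alive, and "stale-removed" if the file was left
    by a dead process (or is unreadable) and has been deleted.
    """
    pid_file = Path(pid_file)
    if not pid_file.exists():
        return "clean"
    try:
        pid = int(pid_file.read_text().strip())
    except ValueError:
        pid = None  # corrupt or partially written file
    if pid is not None and pid > 0 and pid_is_running(pid):
        return "already-running"
    pid_file.unlink()
    return "stale-removed"
```

A start script might call prepare_start() before launching the application and skip the launch when the result is "already-running". Note that after a reboot an unrelated process can reuse the recorded PID, so a production script would probably verify more than the PID alone, for example the process name.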