Managing HP Serviceguard for Linux, Seventh Edition, July 2007

B Designing Highly Available Cluster Applications

This appendix describes how to create or port applications for high
availability, with emphasis on the following topics:

• Automating Application Operation
• Controlling the Speed of Application Failover
• Designing Applications to Run on Multiple Systems
• Restoring Client Connections
• Handling Application Failures
• Minimizing Planned Downtime

Designing for high availability means reducing the amount of both
unplanned and planned downtime that users experience. Unplanned
downtime includes unscheduled events such as power outages, system
failures, network failures, disk crashes, or application failures.
Planned downtime includes scheduled events such as backups, system
upgrades to new OS revisions, or hardware replacements.

Two key strategies should be kept in mind:

1. Design the application to handle a system reboot or panic. If you are
   modifying an existing application for a highly available environment,
   determine what currently happens to the application after a system
   panic. In a highly available environment there should be defined
   (and scripted) procedures for restarting the application. Procedures
   for starting and stopping the application should be automatic, with
   no user intervention required.

2. The application should not use system-specific information, such as
   the following, if doing so would prevent it from failing over to
   another system and running properly:

   • The application should not refer to uname() or gethostname().
   • The application should not refer to the SPU ID.
   • The application should not refer to the MAC (link-level) address.
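
The two strategies above can be sketched together in a small Python
control wrapper. This is only an illustration under stated assumptions,
not Serviceguard's own interface: the names PackageConfig, start, stop,
and is_running, the pid file, and the SERVICE_NAME and SERVICE_ADDRESS
environment variables are all hypothetical. The wrapper starts and stops
the application without user input, can be run repeatedly without harm,
and hands the application its identity from configuration instead of
letting it call gethostname().

```python
import os
import signal
import subprocess
import sys
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class PackageConfig:
    """Identity supplied by configuration (hypothetical), not read from the node."""
    service_name: str     # logical name that clients use
    service_address: str  # address that moves with the application on failover
    pid_file: str         # where this wrapper records the running process


def _read_pid(cfg):
    """Return the recorded pid, or None if the file is missing or unreadable."""
    try:
        with open(cfg.pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        return None


def is_running(cfg):
    """Return True if the pid file names a live process."""
    pid = _read_pid(cfg)
    if pid is None:
        return False
    try:
        os.kill(pid, 0)  # signal 0 checks for existence without sending anything
        return True
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user


def start(cfg, command):
    """Start the application unless it is already running. Safe to repeat."""
    if is_running(cfg):
        return False
    # The application reads its identity from these variables (an assumed
    # convention) instead of asking the operating system which node it is on.
    env = dict(os.environ,
               SERVICE_NAME=cfg.service_name,
               SERVICE_ADDRESS=cfg.service_address)
    proc = subprocess.Popen(command, env=env)
    with open(cfg.pid_file, "w") as f:
        f.write(str(proc.pid))
    return True


def stop(cfg):
    """Stop the application if it is running, then clear the pid file."""
    pid = _read_pid(cfg)
    was_running = pid is not None and is_running(cfg)
    if was_running:
        try:
            os.kill(pid, signal.SIGTERM)
            os.waitpid(pid, 0)  # reap the process when it is our own child
        except (ProcessLookupError, ChildProcessError):
            pass  # already gone, or started by another invocation
    if os.path.exists(cfg.pid_file):
        os.remove(cfg.pid_file)
    return was_running


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = PackageConfig("orders-db", "192.0.2.10", os.path.join(tmp, "app.pid"))
        # Stand-in for the real application: a process that just sleeps.
        command = [sys.executable, "-c", "import time; time.sleep(60)"]
        print("started:", start(cfg, command))
        print("started again:", start(cfg, command))
        print("stopped:", stop(cfg))
```

Because start and stop each check the current state first, a restart
procedure can call them after a reboot or panic without knowing whether
the application was left running. A pid file that survives a reboot may
name an unrelated process, so a real script would probably keep it in a
location that is cleared at boot.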