Availability Guide for Application Design
What Is Application Availability?
Availability Guide for Application Design—525637-004
Outage Classes
Design Outages
Design outages are usually caused by malfunctioning software, either system software 
or application software. Again, deterministic faults are rare throughout the industry. 
Transient problems are more common, as users of most personal computers will testify.
Potential causes of a design outage on a NonStop system include a LAN broadcast 
storm or degrading response time. Good software engineering practices are the 
primary means for preventing deterministic design outages. Process pairs tolerate 
transient faults well; the backup process can continue where the primary failed 
because its memory, queues, and so on, differ from those of the failed primary. 
Refer to Section 7, Availability Through Process-Pairs and Monitors, for details.
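As a rough illustration of why a backup can survive a fault that stops its primary, the sketch below models a process pair in Python. It is not the NonStop process-pair API: the Worker class, the poisoned flag (a stand-in for corrupted local memory), and the run_pair function are all hypothetical names chosen for this example. The backup receives only checkpointed application state, so it does not inherit the condition that made the primary fail.

```python
import copy


class TransientFault(Exception):
    """Simulated fault tied to the failing process's own local condition."""


class Worker:
    def __init__(self, name):
        self.name = name
        self.state = {"processed": []}
        self.poisoned = False  # hypothetical stand-in for corrupted local memory

    def checkpoint_from(self, other):
        # Copy application state only; the other process's local
        # condition (the poisoned flag) is deliberately not copied.
        self.state = copy.deepcopy(other.state)

    def handle(self, request):
        if self.poisoned:
            raise TransientFault(f"{self.name} failed on {request}")
        self.state["processed"].append(request)
        return f"{self.name} handled {request}"


def run_pair(requests, fail_at):
    """Process requests with a primary/backup pair; inject a fault at index fail_at."""
    primary, backup = Worker("primary"), Worker("backup")
    log = []
    for i, req in enumerate(requests):
        if i == fail_at:
            primary.poisoned = True  # fault affects the primary only
        try:
            log.append(primary.handle(req))
        except TransientFault:
            # Takeover: the backup becomes the primary and retries the request.
            primary, backup = backup, Worker("new-backup")
            log.append(primary.handle(req))
        backup.checkpoint_from(primary)  # checkpoint after each completed request
    return log, primary.state["processed"]
```

Calling run_pair(["a", "b", "c"], fail_at=1) processes all three requests: the original primary handles the first, and the backup takes over for the rest. A deterministic fault, by contrast, would also fail in the backup, which is why the text points to software engineering practices rather than process pairs for that case.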
The transaction model also tolerates faults well through atomicity, consistency, 
isolation, and durability. Refer to Section 4, Data Protection and Recovery, for details.
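To make the atomicity property concrete, here is a minimal sketch using Python's standard sqlite3 module as a stand-in for a transactional database; it does not show the NonStop transaction facilities. The accounts table and the transfer and demo functions are assumptions made for this example. When a fault occurs partway through a transfer, the whole transaction is rolled back, so no partial update is left behind.

```python
import sqlite3


def transfer(conn, src, dst, amount):
    """Move funds atomically: both updates commit, or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
            (balance,) = conn.execute(
                "SELECT balance FROM accounts WHERE name = ?", (src,)
            ).fetchone()
            if balance < 0:
                # Simulated fault in the middle of the transaction.
                raise ValueError("insufficient funds")
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
        return True
    except ValueError:
        return False


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE accounts (name TEXT PRIMARY KEY, balance INTEGER)")
    conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("a", 100), ("b", 0)])
    conn.commit()
    ok = transfer(conn, "a", "b", 60)      # succeeds
    failed = transfer(conn, "a", "b", 60)  # would overdraw, so it is rolled back
    balances = dict(conn.execute("SELECT name, balance FROM accounts"))
    conn.close()
    return ok, failed, balances
```

In demo, the second transfer debits the source account before the fault is detected, yet the final balances reflect only the first transfer because the debit is undone along with the rest of the transaction.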
Operational Outages
Operational errors occur when an operator or support person makes a mistake.
Examples of operational errors on a NonStop system include accidentally pushing the 
power-off button, incorrectly installing the operating system, and pulling the good 
processor board when intending to replace the faulty one.
Training and automated problem handling provide the best protection against 
operational errors. Section 8, Instrumenting an Application for Availability, provides 
details on how to design an application to generate event messages when problems 
occur. The NonStop Distributed Systems Management (DSM) subsystem can be used 
to automate a response or present the information to an operator in a way that clarifies 
and simplifies the response procedures. Refer to the Availability Guide for Problem 
Management for details on how to handle event messages.
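The idea of pairing application event messages with automated responses can be sketched as follows. This is a simplified Python model, not the DSM interfaces: the Event and EventRouter classes, the subsystem name "APP", and the event numbers are hypothetical. Events with a registered action are handled automatically; everything else is formatted and queued for an operator.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple


@dataclass
class Event:
    subsystem: str  # hypothetical subsystem identifier
    number: int     # hypothetical event number
    text: str


@dataclass
class EventRouter:
    # Maps (subsystem, event number) to an automated action.
    actions: Dict[Tuple[str, int], Callable[[Event], str]] = field(default_factory=dict)
    operator_queue: List[str] = field(default_factory=list)

    def on(self, subsystem: str, number: int, action: Callable[[Event], str]) -> None:
        """Register an automated response for one kind of event."""
        self.actions[(subsystem, number)] = action

    def emit(self, event: Event) -> str:
        """Run the automated response if one exists; otherwise queue for an operator."""
        action = self.actions.get((event.subsystem, event.number))
        if action is not None:
            return action(event)
        self.operator_queue.append(
            f"[{event.subsystem} {event.number}] {event.text}"
        )
        return "queued for operator"
```

For example, after router.on("APP", 101, lambda e: "restarted server"), emitting Event("APP", 101, "server down") triggers the registered action, while an unregistered event such as Event("APP", 999, "unknown") ends up in router.operator_queue with a uniform format that an operator can act on.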
Environmental Outages
Environmental outages result from an external condition that has nothing to do with the 
design or operation of the computer installation.
Causes of environmental outages include major natural disasters such as 
earthquakes, electrical storms, and flooding; man-made disasters; and more mundane 
events such as power-grid problems. Note that in the U.S.A., which has one of the 
most reliable power services in the world, the average computer room experiences 443 
power faults per year. In other words, scarcely a day passes without the power quality 
being compromised.
Keeping a remote duplicate database enables fast recovery from natural or man-made 
disasters; Section 4, Data Protection and Recovery, provides details. Some NonStop 
servers are designed to tolerate earthquakes up to a magnitude 8.2 on the Richter 
scale.










