Availability Guide for Problem Management
Availability Guide for Problem Management–125509
About This Manual
The Availability Guide for Problem Management explains how to maximize system and 
application availability by preventing problems from becoming unplanned outages. This 
manual:
• Defines problem management and explains how it relates to the operations
  management (OM) framework and online management
• Describes the causes of unplanned outages and explains how to predict, prevent, and
  prepare for them
• Shows how to quickly bring a system or application back online after an unplanned
  outage by implementing efficient problem-resolution techniques
• Describes how to predict, prevent, and detect problems by effectively managing
  system and application messages and by monitoring important objects
• Describes how to predict, prevent, detect, and quickly recover from problems by
  automating operations and recovery procedures
• Lists and describes the tools provided by Tandem to detect, analyze, and recover
  from problems, and to administer the problem environment
Who Should Read This Manual?
Anyone, or any group, responsible for managing Tandem systems should read this
manual. It assumes that the reader has worked with NonStop systems before and is
familiar with operations management. The following table identifies some typical
readers and the kinds of information they can find in this manual.
These kinds of readers…             Look for this information about…

Operations management               • Choosing Tandem products for problem management
                                    • Understanding how various products fit together

Operations and support personnel    • Diagnosing and solving (or escalating) problems
                                    • Monitoring systems
                                    • Logging problems
                                    • Measuring and analyzing system performance
                                    • Automating operations and recovery procedures
                                    • Setting policies for problem escalation and disaster
                                      recovery
                                    • Ensuring that all personnel are adequately trained










