Availability Guide for Problem Management

About This Manual
Availability Guide for Problem Management 125509
What Is in This Manual?
This manual is organized into nine sections, as follows:
Section 1, “Introduction to Problem Management,” defines problem management
and explains how it relates to the OM framework and online management.
Section 2, “Preventing Unplanned Outages,” describes common causes of unplanned
outages and explains how to predict, prevent, and prepare for them.
Section 3, “Recovering From Unplanned Outages,” describes how to quickly bring
the system or application back online after an unplanned outage.
Section 4, “Monitoring Event Messages,” describes how to predict, prevent, and
detect problems by effectively managing system and application messages.
Section 5, “Monitoring Objects,” describes how to predict, prevent, and detect
problems by monitoring important objects.
Section 6, “Automating Operations and Recovery Procedures,” describes how to
predict, prevent, detect, and quickly recover from problems by automating
operations and recovery procedures.
Section 7, “Auditing Systems for Fault Tolerance,” describes how to prevent
unplanned outages by identifying and eliminating any potential problems that can
affect system and application availability.
Section 8, “Planning for Disasters,” describes how to prevent, prepare for, and
recover from a disaster.
Section 9, “Problem Management Tools,” describes the tools provided by Tandem to
detect, analyze, and recover from problems and to administer the problem
environment.
Where to Find More Information
The following manuals contain information that may be of interest to readers of this
manual:
Introduction to NonStop Operations Management
This manual introduces managers to NonStop system operations. It provides
guidelines, suggestions, and ideas on operations and support areas, and operations
documentation. It also describes how to automate and centralize operations, and
how to improve operations management processes. This manual is a prerequisite for
reading other Tandem operations manuals.