Availability Guide for Problem Management

Recovering From Unplanned Outages
Availability Guide for Problem Management (125509)
Step 2: Gathering Facts and Reporting the Problem
When a problem is detected, collect the relevant facts and notify the appropriate
personnel. Consider establishing procedures for reporting problems.
Established procedures can help you track:
- Each problem that occurs
- How the problem was resolved
- Who resolved the problem and when
- Recurring problems
- How long it took to resolve the problem
- Whether a problem can be prevented or recovery procedures for that problem can be
  automated
Problem tracking also provides a way to determine which problems are the most
damaging and should have the highest priority for prevention techniques.
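As an illustration only, the tracked items listed above could be held in a simple record, and the log could then be ranked to show which kinds of problems cost the most time. The following Python sketch is not part of any Tandem or CA-Unicenter product. The names ProblemRecord and rank_by_total_resolution_time, the category field, and the use of total resolution time as the measure of damage are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class ProblemRecord:
    """One entry in a hypothetical problem-reporting log."""
    category: str                            # label used to spot recurring problems
    description: str                         # what went wrong
    detected_at: datetime                    # when the problem was detected
    resolved_at: Optional[datetime] = None   # when it was resolved (None if open)
    resolved_by: str = ""                    # who resolved the problem
    resolution: str = ""                     # how the problem was resolved
    preventable: bool = False                # can it be prevented or automated?

    def time_to_resolve(self) -> Optional[timedelta]:
        """Return how long resolution took, or None if still open."""
        if self.resolved_at is None:
            return None
        return self.resolved_at - self.detected_at


def rank_by_total_resolution_time(log):
    """Return (category, occurrences, total time) tuples, costliest first.

    Assumption: total time spent resolving a category stands in for how
    damaging it is. Problems that are still open are skipped.
    """
    totals = defaultdict(lambda: [0, timedelta()])
    for record in log:
        duration = record.time_to_resolve()
        if duration is None:
            continue  # still open, so no resolution time yet
        totals[record.category][0] += 1
        totals[record.category][1] += duration
    ranked = [(cat, count, total) for cat, (count, total) in totals.items()]
    ranked.sort(key=lambda row: row[2], reverse=True)
    return ranked
```

A category that appears more than once in the ranked output is a recurring problem, and the categories at the top of the list are natural first candidates for prevention techniques. Other measures of damage, such as the number of users affected, could be substituted for resolution time.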
Two different logs—a problem-reporting log and an outage log—can help you track and
resolve problems that can turn into unplanned outages. Though some of the information
gathered for the two logs is similar, their purposes are very different.
You can use the CA-Unicenter for Tandem Problem Management function to automate
help desk activities by defining, assigning ownership of, tracking, and recording the
resolution of system and end-user problems.
Gathering the Facts
Collect as much information as you can about the problem and the circumstances in
which the problem occurred.
Facts About the Problem
What?       What is the nature of the problem? What specifically is wrong?
Where?      Where was the problem first noticed? Where since? Which applications,
            components, devices, and users are affected? If the problem involves a
            device, what is the device name and location?
When?       When did the problem occur? What is the frequency of the problem?
            Has the problem occurred before?
Magnitude?  What is the magnitude of the problem? Is it quantifiable in any way?
            (That is, can it be measured?) For example, how many users are
            affected? Is the problem getting worse?
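The four questions above can also serve as a checklist so that a problem report is not filed with gaps. The following Python sketch shows one possible way to do this. The names ProblemFacts and unanswered are hypothetical, and the free-text fields are an assumption made for the example rather than a prescribed report format.

```python
from dataclasses import dataclass, fields


@dataclass
class ProblemFacts:
    """Hypothetical fact sheet mirroring the four questions above."""
    what: str = ""        # nature of the problem; what specifically is wrong
    where: str = ""       # where noticed; affected applications, devices, users
    when: str = ""        # time of occurrence, frequency, earlier occurrences
    magnitude: str = ""   # measurable impact, such as number of users affected


def unanswered(facts: ProblemFacts) -> list:
    """Return the names of questions that are still blank, in sheet order."""
    return [f.name for f in fields(facts) if not getattr(facts, f.name).strip()]
```

For example, a sheet with only the what and when fields filled in would report where and magnitude as unanswered, which tells the person gathering facts what to ask next.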