Availability Guide for Problem Management (125509)

Preventing Unplanned Outages
Make sure that recovery procedures documentation is easily accessible to operations
and support staff.
Make sure that full copies of all manuals, including all relevant application user
manuals, are available in the operations area, either online or in hard-copy format.
Make sure that backup tapes are available and readable. Standardize external
labeling, and ensure that files can be restored.
Assign responsibility for carrying out recovery procedures to the appropriate people.
Ensure that a terminal and telephone are situated next to the system so an operator
can discuss the problem on the phone while accessing information.
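One way to confirm that files can be restored is to run a periodic trial restore to a scratch location and compare the restored copies with the originals. The following is a minimal sketch of that comparison, assuming the originals and the restored copies are both reachable as ordinary directories from a general-purpose host running Python 3.9 or later. It does not use any Tandem utility, and the names sha256_of and verify_restore are hypothetical, chosen only for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_restore(source_dir: Path, restored_dir: Path) -> list[str]:
    """Compare every file under source_dir with its restored copy.

    Returns a sorted list of problems: files missing from restored_dir
    or files whose contents differ. An empty list means the trial
    restore matched the originals.
    """
    problems = []
    for source_file in source_dir.rglob("*"):
        if not source_file.is_file():
            continue  # skip directories
        relative = source_file.relative_to(source_dir)
        restored_file = restored_dir / relative
        if not restored_file.is_file():
            problems.append(f"missing: {relative.as_posix()}")
        elif sha256_of(source_file) != sha256_of(restored_file):
            problems.append(f"differs: {relative.as_posix()}")
    return sorted(problems)
```

Calling verify_restore(Path("originals"), Path("restored")) after a trial restore returns an empty list when every file came back intact. Any entries in the list identify files to investigate before the backup is relied on in a real recovery.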
Well-Trained Staff
A well-trained operations and support staff is essential to achieving high
availability in your environment. An inadequately trained operations staff is one of the
biggest vulnerabilities an operations group can face. A well-trained staff is better able to
respond to and resolve problems. You should conduct surprise drills to ensure that your
staff can perform the recovery procedures as documented.
Tandem Education
The Tandem Education Group offers lecture-based courses, customized courses,
independent study programs, computer-based training, and videotapes. Lecture-based
courses are offered at Tandem training centers throughout the world. Customized
courses are offered on demand. These courses range from an introduction to Tandem
concepts and facilities to specialized courses on Tandem software products and
operations. Tandem courses include, but are not limited to, the following:
Guardian System Utilities
Basic Tandem Operator Tasks
Problem Solving for Tandem Operators
System Management
Security Concepts and Planning
Guardian Principles
For more information on Tandem Education courses and training programs, ask your
Tandem representative for a copy of the Tandem Education Course Catalog. This
catalog contains a complete list of courses, training programs, and training centers. It
also contains diagrams showing training paths for a variety of Tandem users, including
network managers, programmers, database administrators, systems and operations
management, and technical specialists.
Well-Designed Applications
Ensure that your system and applications take advantage of quick startup and shutdown
techniques. The Availability Guide for Change Management provides operational
strategies for reducing startup and shutdown time. Take advantage of Tandem's
fault-tolerant design to ensure that your hardware and applications are less vulnerable
to problems that may occur. See Section 7, “Auditing Systems for Fault Tolerance.”