Availability Guide for Application Design—525637-004
8 Instrumenting an Application for Availability
This section provides an overview of application error handling and instrumentation, and
encourages the application designer to integrate instrumentation into the initial design of
the application. Written from the viewpoint of application development, this section
emphasizes instrumentation of the business application itself. It also discusses some
of the services, applications, and utilities that support problem management and
performance management, so that you can understand how the information is
used and how the interfaces should be written.
Once your application goes online, the burden of keeping it online is often carried by
the operations staff. The operations staff must do what they can to prevent the
application from going down and, if it does go down, get the application back online
quickly.
The role of the designer and developer is to establish the criteria that show the health
of the application and to decide how that information is communicated to the human and
automated operators and support analysts.
This section discusses what application designers and developers can do to help the
operations staff by providing appropriate instrumentation in the application. The
application should provide:
• Information to the operator about changes in the status of application objects
• A command-and-response interface that the operator can use to monitor and
  control the application in an appropriate way
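
As a rough illustration of these two requirements, the following Python sketch shows an
application object that reports each status change as an event and answers a small set of
operator commands. All names in it (AppObject, StatusEvent, and the STATUS, START, and STOP
commands) are hypothetical and are not part of any HP interface; a real application would
report events and accept commands through the standard interfaces of its platform.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class StatusEvent:
    """One status change of one application object (hypothetical format)."""
    object_name: str
    old_state: str
    new_state: str


class AppObject:
    """An application object that reports status changes and accepts commands."""

    def __init__(self, name: str, report: Callable[[StatusEvent], None]) -> None:
        self.name = name
        self.state = "STOPPED"
        self._report = report  # where events are sent (log, console, collector)

    def _set_state(self, new_state: str) -> None:
        # Report only real changes, so the operator is not flooded with noise.
        if new_state != self.state:
            event = StatusEvent(self.name, self.state, new_state)
            self.state = new_state
            self._report(event)

    def handle_command(self, command: str) -> str:
        """Command-and-response interface: every command gets a text reply."""
        verb = command.strip().upper()
        if verb == "STATUS":
            return f"{self.name} {self.state}"
        if verb == "START":
            self._set_state("STARTED")
            return f"{self.name} {self.state}"
        if verb == "STOP":
            self._set_state("STOPPED")
            return f"{self.name} {self.state}"
        return f"ERROR unknown command: {verb}"


# Collect events in a list here; a real application would forward them
# to whatever event-reporting interface the operations staff monitors.
events: List[StatusEvent] = []
server = AppObject("ORDER-SERVER", events.append)
print(server.handle_command("status"))
print(server.handle_command("start"))
print(events)
```

Keeping the event report and the command reply as separate paths means that the operator
learns about a change even when the change was not caused by an operator command.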
In other words, you should provide the same kinds of functions in the business
application that HP provides for control of many subsystems.
If you use standard interfaces for reporting application events and for receiving
commands, you can write applications that automatically perform the majority of
operations tasks. This reduces the burden on the operations staff and provides a
quicker response to state changes in the application, which increases availability.
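
One way to picture such an automated operator is as a rule table that maps reported events
to commands. The Python sketch below is a minimal illustration under that assumption only:
the event format, the RULES table, and the automated_operator and fake_send functions are
all hypothetical, and the command interface is replaced by a stand-in that records what
would have been sent.

```python
from typing import Callable, Dict, List, Tuple

# An event is (object name, new state) in this sketch.
Event = Tuple[str, str]

# Hypothetical policy: restart anything that reports DOWN,
# and leave every other state to the human operator.
RULES: Dict[str, str] = {"DOWN": "START"}


def automated_operator(
    events: List[Event],
    send_command: Callable[[str, str], str],
) -> List[Event]:
    """Respond to the events a rule covers; return the rest for a human."""
    unhandled: List[Event] = []
    for object_name, state in events:
        command = RULES.get(state)
        if command is None:
            unhandled.append((object_name, state))
        else:
            send_command(object_name, command)
    return unhandled


# Stand-in for the application's command interface: it only records calls.
sent: List[Tuple[str, str]] = []


def fake_send(object_name: str, command: str) -> str:
    sent.append((object_name, command))
    return "OK"


left_over = automated_operator(
    [("ORDER-SERVER", "DOWN"), ("PRINT-QUEUE", "FULL")], fake_send
)
print(sent)       # commands issued automatically
print(left_over)  # events that still need a person
```

The sketch works only because events and commands have a predictable form, which is the
practical reason to use standard interfaces for both.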
This section provides:
• A design philosophy about when to recover without operator intervention and when
  to instrument for operator intervention; refer to "Design Philosophy for Error
  Handling" on page 8-2.
• An explanation of what instrumentation is and a framework for planning your
  instrumentation to provide availability; refer to "What Is Instrumentation and Why Is
  It Necessary?" on page 8-7.