Availability Guide for Application Design
Instrumenting an Application for Availability
Availability Guide for Application Design—525637-004
The Subsystem Programmatic Interface (SPI)
critical situations and lets the operator (with the help of programmatic tools)
make the final determination.
• A version identifier for the application
Different versions of an application have different problem histories. Knowing the
version of the application can be key in analyzing a problem and in determining a
solution.
• An event number or event type identifier
The event number indicates the category of the event, specifying what happened.
Similar events should have the same event number. Significantly different events
should have different event numbers. For example, all instances of an automated
teller running out of currency would typically have the same event number.
• The component of the application (known as the event subject) affected by the event
For example, this item might indicate which specific automated teller is empty.
• A timestamp indicating when the message was generated
The value of the timestamp is automatically generated by EMS.
• Where appropriate, the recommended recovery action
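The items listed above can be pictured as the fields of a single record. The following Python sketch is only an illustration of that structure and does not use the EMS interface: the class AppEvent, the event-number constants, the version string, and the recovery text are all hypothetical. In particular, the timestamp default here merely stands in for the value that EMS generates automatically.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

# Hypothetical event numbers: one number per category of event.
EVT_ATM_OUT_OF_CURRENCY = 1001
EVT_ATM_OFFLINE = 1002


@dataclass(frozen=True)
class AppEvent:
    """Illustrative event record; not an EMS data structure."""
    app_version: str                        # version identifier of the application
    event_number: int                       # category of the event (what happened)
    subject: str                            # component affected (the event subject)
    recovery_action: Optional[str] = None   # included only where appropriate
    # Stand-in for the timestamp that EMS would generate automatically.
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )


def out_of_currency(atm_id: str) -> AppEvent:
    """Build the event for an automated teller that has run out of currency."""
    return AppEvent(
        app_version="2.3.1",  # hypothetical version string
        event_number=EVT_ATM_OUT_OF_CURRENCY,
        subject=atm_id,
        recovery_action="Replenish currency",  # hypothetical recovery text
    )
```

With this sketch, out_of_currency("ATM-017") and out_of_currency("ATM-042") yield records that share one event number and differ only in their subject and timestamp, which mirrors the guidance that similar events should have the same event number.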
Event message design is important for automated operations because event
messages might be the only source of information when an unexpected condition
occurs in an application. Each event message must be in the proper format to be used
by the appropriate command. Refer to Command Messages on page 8-23 for how
information in the event message maps to information in a corresponding command
message.
Standard Event Messages
Although you can design EMS event messages yourself, for most purposes, you can
use standard event messages. These messages are preformatted and save you time
in development. For information about standard event message formats, refer to the
EMS Manual section on standard event messages.
Examples of Event Message Contents
Generally, your application should report anything that might affect how your
application is managed. However, you must be selective to avoid overwhelming
management applications and operations staff with messages that are of little help.
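One possible way to stay selective is to suppress repeats of the same event for the same subject within a short interval. The sketch below assumes that approach purely for illustration; the class EventThrottle, its method names, and the choice of window are hypothetical and are not part of EMS. Whether suppression is appropriate at all depends on the application.

```python
from typing import Dict, Tuple


class EventThrottle:
    """Suppress repeats of the same (event number, subject) within a window."""

    def __init__(self, window_seconds: float) -> None:
        self.window_seconds = window_seconds
        # Time at which each (event number, subject) pair was last reported.
        self._last_sent: Dict[Tuple[int, str], float] = {}

    def should_report(self, event_number: int, subject: str, now: float) -> bool:
        """Return True if the event should be reported at time `now` (seconds)."""
        key = (event_number, subject)
        last = self._last_sent.get(key)
        if last is not None and now - last < self.window_seconds:
            return False  # same event for the same subject, still inside the window
        self._last_sent[key] = now
        return True
```

For example, with a 60-second window, a second report of event 1001 for subject "ATM-017" arriving 30 seconds after the first is suppressed, while the same event number for a different subject is still reported.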
Listed below are some typical situations for which a highly available application might
generate events:
• An object has changed state.
In addition to providing immediate notification that an object has gone offline, this
kind of information is important in problem diagnosis. Problem diagnostic