Availability Guide for Problem Management

Monitoring Event Messages
What Are System and Application Event Messages?
Event messages are a special subset of Subsystem Programmatic Interface (SPI)
messages. Like all SPI messages, the information in event messages is contained in
tokens but, unlike many SPI messages, event messages take advantage of formatting
templates that convert the tokenized information into local-language text. As a result, the
messages are easy both for applications to process and for operators to read. The
messages convey information about significant changes in the system, subsystem, or
application environment. Occurrences and conditions reported by event messages
include:
• Changes in the subsystem environment
• Errors encountered during continuous operation
• Conditions that might lead to a problem if not corrected
• Conditions that require operator intervention
• Significant losses of function or resources
• Conditions that cause a process to terminate
Event messages are reported by Tandem subsystems and can also be generated by
user-written applications. Messages are subsystem- or application-specific, and there are
hundreds of different event messages produced by Tandem subsystems alone. While this
wide range of messages provides the diversity and depth of information required to
manage systems and networks, it also creates the need for tools to manage the event
messages themselves.
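
The token-and-template idea described in this section can be illustrated with a minimal sketch. It uses plain Python dictionaries as stand-ins and does not use the actual SPI procedures, token codes, or template format; the subsystem name, event number, token names, and template text are all hypothetical and chosen only for illustration.

```python
# Illustrative only: plain Python stand-ins, not SPI procedures or real token codes.

# A hypothetical event message: token names mapped to values.
event = {
    "subsystem": "DISK",      # hypothetical subsystem name
    "event_number": 101,      # hypothetical event number
    "device": "$DATA1",       # hypothetical device name
    "error": 59,              # hypothetical error code
}

# Hypothetical formatting templates, keyed by (subsystem, event number).
TEMPLATES = {
    ("DISK", 101): "Device {device} reported error {error}",
}


def format_event(tokens, templates):
    """Render tokenized event data as operator-readable text."""
    key = (tokens["subsystem"], tokens["event_number"])
    template = templates.get(key)
    if template is None:
        # No template for this event: fall back to listing the raw tokens.
        return ", ".join(f"{name}={value}" for name, value in sorted(tokens.items()))
    # Unused tokens are ignored by str.format; only named fields are filled in.
    return template.format(**tokens)


print(format_event(event, TEMPLATES))
```

The same token dictionary serves both audiences: an application reads the values directly by name, while an operator sees the text produced from the template.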
Managing System Event Messages
Because the volume of event messages is typically large, you need a plan for managing
them effectively and thus helping to prevent unplanned outages. An
effective message management strategy should include the ability to:
• Suppress unnecessary messages
• Automate message response
• Standardize messages with consistent appearance and management context
• Recognize and respond to subsystem messages
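
As a rough sketch of how the first two abilities in this list might fit together, the following Python fragment drops events named in a suppression set and routes others to an automated response. It is an assumption-laden illustration, not a Tandem product interface: the rule tables, subsystem names, event numbers, and the `restart_process` handler are all hypothetical.

```python
# Illustrative only: hypothetical rules and handlers, not a Tandem product interface.

# Hypothetical (subsystem, event number) pairs considered unnecessary.
SUPPRESS = {("SPOOL", 12)}


def restart_process(event):
    """Hypothetical automated response: describe a restart request."""
    return f"restart requested for {event['process']}"


# Hypothetical automated responses, keyed by (subsystem, event number).
RESPONSES = {("APP", 7): restart_process}


def handle_event(event):
    """Return an (action, detail) pair for one tokenized event."""
    key = (event["subsystem"], event["event_number"])
    if key in SUPPRESS:
        return ("suppressed", None)          # never reaches the operator
    handler = RESPONSES.get(key)
    if handler is not None:
        return ("automated", handler(event))  # handled without operator action
    return ("display", event)                 # everything else goes to the operator
```

Checking suppression before automation keeps unnecessary messages from triggering any response at all; a real strategy might order or combine these rules differently.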
Managing system event messages can help you achieve the following operations
management goals:
• Improve the quality of end-user services, system performance, and application
  availability.
• Reduce the complexity of system management tasks by improving system visibility
  and control, and by reducing the time it takes to detect and correct problems.
• Lower the cost of ownership by improving the productivity of your operators and
  system-support personnel.