Introduction to NonStop Operations Management

Application Management
Introduction to NonStop Operations Management 125507
Requirements
Events and operator messages. The application should use the Event Management
Service (EMS) to format events and messages in a standard fashion. Make sure that
the application provides the information you need. For example:

• Make sure that an event or message is generated whenever a problem occurs.
  For example, an event or message should be generated when there is a modem
  problem, an application problem, a network problem, a tape problem, and so on.
• Messages should contain enough information so that operators can readily
  identify the application, and the component within the application, that is
  causing the problem. For example, if a server process in a Pathway application
  is causing a problem, operator messages should identify both the application
  and the server process that is generating the messages.
• You might want to create separate console environments for system and
  application messages. This can help you correlate the causes of problems when
  they occur. For example, a communication line going down might generate only
  one critical message, whereas the application might generate dozens of them.
  Separating the two message streams makes it easier to understand the
  cause-and-effect relationship of problems.
• If you use tokenized events, you might want to require standard tokens and
  standard placement of the tokens.
• For operator messages, you might want to require standard message formats.
  Determine whether you need message numbers and standard text such as
  “WARNING,” “ERROR,” or “INFORMATIONAL.” Determine the type of
  information your staff needs to solve problems. For example, should the
  message list terminal names, user IDs, or system names?
• You might want to create EMS event filters to select only events that are
  critical or that require action. The filter could search messages for words
  such as ERROR, ABENDING, ABORT, and EXCEPTION, and highlight only those
  messages.
• Make sure that all events and messages are documented. Documentation should
  explain the cause of the event or message, the effect, and the required
  recovery action.

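
The guidelines above on standard message formats and keyword-based filtering can be illustrated with a small sketch. This is not EMS filter syntax and does not call any EMS interface; it is plain Python over text lines, intended only to show the idea. The field names, the message layout, the severity list, and the sample values (SYSA, ORDERS, SERVER-1) are all assumptions made for illustration; only the keywords ERROR, ABENDING, ABORT, and EXCEPTION come from the text.

```python
import re
from dataclasses import dataclass

# Standard severity text suggested in the section (assumed to be the full set).
SEVERITIES = ("INFORMATIONAL", "WARNING", "ERROR")

# Keywords the section lists as examples a filter could search for.
ACTION_KEYWORDS = ("ERROR", "ABENDING", "ABORT", "EXCEPTION")
# Case-sensitive, whole-word match: "ABORT" matches, "ABORTED" does not.
_KEYWORD_RE = re.compile(r"\b(?:" + "|".join(ACTION_KEYWORDS) + r")\b")


@dataclass(frozen=True)
class OperatorMessage:
    """Hypothetical operator message with the identifying fields the text asks for."""
    number: int        # message number
    severity: str      # one of SEVERITIES
    system: str        # system name
    application: str   # application that owns the component
    component: str     # component (for example, a server process) reporting the problem
    text: str          # free-form description

    def format(self) -> str:
        """Render the message in one fixed, assumed layout."""
        if self.severity not in SEVERITIES:
            raise ValueError(f"unknown severity: {self.severity!r}")
        return (f"{self.number:04d} {self.severity} {self.system} "
                f"{self.application}.{self.component}: {self.text}")


def requires_action(line: str) -> bool:
    """Return True if the line contains any of the action keywords."""
    return _KEYWORD_RE.search(line) is not None


# Example: format two messages, then keep only those that need attention.
messages = [
    OperatorMessage(17, "ERROR", "SYSA", "ORDERS", "SERVER-1", "tape mount failed"),
    OperatorMessage(18, "INFORMATIONAL", "SYSA", "ORDERS", "SERVER-1", "startup complete"),
]
lines = [m.format() for m in messages]
critical = [line for line in lines if requires_action(line)]
```

Because every message names the system, application, and component in the same position, an operator or a filter can identify the source of a problem without parsing free text. A keyword match on text is a coarse approach; if events are tokenized, filtering on the tokens themselves is likely to be more reliable.
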
TSM EMS Event Viewer and NonStop Virtual Hometerm Subsystem (VHS)
usage for system monitoring and event collection. If your organization uses VHS,
make sure the application allows you to use VHS for system monitoring and event
collection. The TSM EMS Event Viewer can display all messages formatted by
EMS. VHS can receive home terminal messages.

Problem escalation procedures. Make sure that your internal application-
development group or the application vendor will provide support when problems
occur. Establish procedures for problem escalation.

Security. Make sure that the application is designed so that the operations staff can
enforce your security policy, audit the application, and perform security
administration tasks.