Availability Guide for Problem Management

Automating Operations and Recovery Procedures
Availability Guide for Problem Management 125509
Starting Batch Jobs
Use the NetBatch product to automatically schedule jobs, such as those that summarize
or post results at the end of the day. NetBatch allows you to run jobs or job steps
anywhere in an Expand network, which means you can automate and consolidate
reporting for widely distributed applications. You can also control schedulers on
different nodes of the Expand network from a central site, reducing operations demands at
remote locations.
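
The idea of one central site holding the schedule for jobs on several nodes can be illustrated with a minimal Python sketch. This is not NetBatch syntax or a NetBatch interface. The Job class, the functions next_run and central_schedule, and the node and job names are all hypothetical and exist only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, time, timedelta


@dataclass(frozen=True)
class Job:
    name: str     # hypothetical job name
    node: str     # hypothetical Expand node name, for example r"\NY"
    run_at: time  # time of day at which the job should start


def next_run(job: Job, now: datetime) -> datetime:
    """Return the next daily start time for the job, strictly after `now`."""
    candidate = datetime.combine(now.date(), job.run_at)
    if candidate <= now:
        # Today's slot has already passed, so schedule for tomorrow.
        candidate += timedelta(days=1)
    return candidate


def central_schedule(jobs: list[Job], now: datetime) -> list[tuple[datetime, str, str]]:
    """Build a single time-ordered schedule that covers every node."""
    return sorted((next_run(job, now), job.node, job.name) for job in jobs)
```

A central site could call central_schedule with the end-of-day jobs for every remote node and obtain one ordered list of (start time, node, job) entries, so that no remote location needs to maintain its own schedule.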
Avoiding Problems of Automated Recovery
Automation often works in the background, and this lack of visibility can itself
cause problems. For example, you might try to bring down and restart a terminal or a
communication line without success, only to find out later that the automated operator
had already recovered from the “failure” efficiently on its own. To address this lack
of visibility, you might need to make the actions of your automated recovery rules
visible to operators.
Making Recovery Rules More Visible
To make your automated recovery rules more visible, you can specify that each time a
rule is executed, the automated operator generates an event that informs operators of the
outcome of the recovery procedure. These recovery events help you understand the
effects of automation in your system environment, and they also provide data for
collecting statistics on the efficiency of your automated operations.
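
As a minimal sketch of this approach, the following Python code wraps each rule execution so that it always records an event with the outcome and updates per-rule statistics. The AutomatedOperator and RecoveryEvent classes, the rule names, and the outcome strings are assumptions made for illustration. They do not represent any Tandem product interface.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class RecoveryEvent:
    rule: str     # name of the rule that ran
    subject: str  # object the rule acted on, such as a line or terminal
    outcome: str  # "recovered" or "failed"


class AutomatedOperator:
    """Hypothetical automated operator that reports every rule execution."""

    def __init__(self) -> None:
        self.event_log: list[RecoveryEvent] = []
        self.stats: Counter = Counter()

    def run_rule(self, rule: str, subject: str,
                 action: Callable[[str], bool]) -> RecoveryEvent:
        """Run a recovery action and always emit an event with its outcome."""
        try:
            succeeded = action(subject)
        except Exception:
            # A recovery action that raises is reported as a failed recovery.
            succeeded = False
        event = RecoveryEvent(rule, subject, "recovered" if succeeded else "failed")
        self.event_log.append(event)             # visible to operators
        self.stats[(rule, event.outcome)] += 1   # feeds efficiency statistics
        return event

    def success_rate(self, rule: str) -> float:
        """Fraction of executions of `rule` that ended in recovery."""
        recovered = self.stats[(rule, "recovered")]
        total = recovered + self.stats[(rule, "failed")]
        return recovered / total if total else 0.0
```

Because run_rule is the only path through which an action executes, no recovery can happen without a corresponding event. An operator who finds a line already restarted can look in event_log to see which rule did it.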
Using Tandem Tools for Automation
Tandem provides the following tools to help you automate your operations and recovery
tasks:
• CA-Unicenter for Tandem provides an Event Management function, which you can
  configure to respond to specific events automatically, and a Workload Management
  function, which you can configure to automatically schedule jobs.
• NetBatch helps you automatically schedule routine tasks, such as those that
  summarize or post results at the end of the day.
• Tandem Failure Data System (TFDS) isolates software problems and provides
  automatic processor failure data collection, diagnosis, and recovery services.
Automating Job Scheduling and Event Response With CA-Unicenter
The CA-Unicenter for Tandem Workload Management function provides automated
submission and tracking of batch processes. It facilitates the definition and enforcement
of complex interrelationships between work units, including predecessor controls,
manual tasks, and cause-and-effect scheduling policies.
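
The notion of predecessor controls can be sketched in Python with the standard graphlib module, which orders work units so that each one runs only after its predecessors. This is an illustration of the general technique, not of how CA-Unicenter implements it. The function names and the job names in the usage note are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter


def submission_order(predecessors: dict[str, set[str]]) -> list[str]:
    """Return an order in which every job follows all of its predecessors.

    `predecessors` maps each job name to the set of jobs that must finish
    first. Raises CycleError if the definitions depend on each other in a loop.
    """
    return list(TopologicalSorter(predecessors).static_order())


def ready_jobs(predecessors: dict[str, set[str]], completed: set[str]) -> list[str]:
    """Return the jobs not yet run whose predecessors have all completed."""
    return sorted(
        job
        for job, required in predecessors.items()
        if job not in completed and required <= completed
    )
```

For example, with the definitions {"POST": {"SUMMARIZE"}, "SUMMARIZE": {"EXTRACT"}, "EXTRACT": set()}, submission_order places EXTRACT first, then SUMMARIZE, then POST. A manual task could be modeled the same way, as a predecessor that is marked complete only when an operator confirms it.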
The CA-Unicenter Event Management facilities allow you to establish lists of actions
that are automatically performed in response to specific events.
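
A list of actions performed in response to specific events can be pictured as a lookup from event type to an ordered list of callables. The following Python sketch assumes a hypothetical EventResponder class and hypothetical event fields ("type" and "subject"). It does not reflect the CA-Unicenter configuration format.

```python
from typing import Callable

# An action receives the event and returns a description of what it did.
Action = Callable[[dict], str]


class EventResponder:
    """Hypothetical registry that maps event types to ordered action lists."""

    def __init__(self) -> None:
        self._actions: dict[str, list[Action]] = {}

    def on(self, event_type: str, *actions: Action) -> None:
        """Append actions to the list performed for `event_type`."""
        self._actions.setdefault(event_type, []).extend(actions)

    def handle(self, event: dict) -> list[str]:
        """Perform, in registration order, every action for the event's type.

        Events with no registered actions produce an empty result.
        """
        return [action(event) for action in self._actions.get(event["type"], [])]
```

For example, registering a restart action and a notify action for a hypothetical "LINE-DOWN" event type causes both to run, in that order, each time handle receives an event of that type.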