Introduction to NonStop Operations Management

Production Management
Introduction to NonStop Operations Management 125507
Back up and restore complete disks. Make sure that the on-site and off-site archives
are current.
Reload application files (might be required more or less frequently, depending on
the applications).
Perform system preventive maintenance. (Your Tandem support representative might
perform this task, depending on your support contract.)
Review weekly performance reports. Determine if additional system capacity will be
needed.
Review monitoring strategies. Make sure that problems are found before they
become serious, that reports are generated when required, and that applications are
running properly.
Review problem-reporting and problem-tracking procedures and make sure that
problems are reported and resolved in a timely manner.
Review system security. Make sure that security procedures are preventing system
penetration and that users and the operations staff are abiding by your organization’s
security policy.
Review staff training requirements.
Recovery Procedures
System problems are never routine. However, recovering from system problems can be
routine if your staff knows how to recover from as many problem situations as possible.
Some of the problems your staff should know how to resolve are:
Terminal-related problems
Looping processes
Application failures
Security problems
Disk failures
Communications failures
Processor failures
Power failures
System failures
Air-conditioning failures
Site disaster recovery
If your staff performs the routine daily tasks, they will detect many of these problems
before the problems become serious.
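The problem categories listed above can be checked against the recovery procedures your staff has actually written down. The following Python sketch is illustrative only: the procedure identifiers (such as RP-DISK-01) and the missing_procedures helper are hypothetical names introduced for this example and are not part of any Tandem tool or utility.

```python
# Problem categories taken from the list above.
PROBLEM_CATEGORIES = [
    "Terminal-related problems",
    "Looping processes",
    "Application failures",
    "Security problems",
    "Disk failures",
    "Communications failures",
    "Processor failures",
    "Power failures",
    "System failures",
    "Air-conditioning failures",
    "Site disaster recovery",
]

# Hypothetical index: problem category -> identifier of the written
# recovery procedure. The identifiers are assumptions for illustration.
DOCUMENTED_PROCEDURES = {
    "Disk failures": "RP-DISK-01",
    "Power failures": "RP-POWER-01",
}


def missing_procedures(categories, documented):
    """Return the categories with no documented procedure, in list order."""
    return [c for c in categories if not documented.get(c)]


if __name__ == "__main__":
    for category in missing_procedures(PROBLEM_CATEGORIES, DOCUMENTED_PROCEDURES):
        print("No recovery procedure documented for:", category)
```

Running a check like this during a periodic review shows which problem situations still lack a written recovery procedure.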
Developing recovery procedures for problems that might occur helps your staff prepare
for potential difficulties. When establishing recovery procedures, consider these
guidelines:
Review the guidelines provided in Section 6, “Problem Management.” Problem-reporting
and problem-tracking procedures help you and your staff learn from past
problems, avoid repeat problems, and recognize future problems quickly.