Guardian Programmer's Guide

Introduction to Guardian Programming
Guardian Programmer’s Guide 421922-014
Mirrored Disks
One effective protection against loss of data is the use of mirrored disk volumes.
Mirrored disk volumes maintain copies of data on two physically independent disk
drives that are accessed as a single device and managed by the same I/O process. All
data written to one disk is also written to the other disk. All data read from one disk
could be read from the other, because the data is identical. A mirrored volume protects
data against single-disk failures: if one disk drive fails, the other remains operational.
The odds against both disk drives failing at the same time are high, provided that a
failed disk drive is always repaired or replaced promptly.
After a disk is replaced or a drive is repaired, all data is copied back onto it when the
operator issues a Peripheral Utility Program (PUP) REVIVE command (on D-series
releases) or Subsystem Control Facility (SCF) REVIVE command (on G-series
releases). Processing continues while the revive operation takes place. Mirrored
operation resumes as the transfer of data begins.
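The mirrored-volume behavior described above can be sketched as a toy model. This is an illustration only, not how the disk I/O process is implemented: the class MirroredVolume and its methods are hypothetical names, each drive is modeled as an in-memory dictionary, and the revive copy is done in one step rather than concurrently with processing.

```python
class MirroredVolume:
    """Toy model of a mirrored volume: two drives behind one interface."""

    def __init__(self):
        # Each drive is a dict of block number -> data.
        # None marks a failed or removed drive (an assumption of this sketch).
        self.drives = [{}, {}]

    def write(self, block, data):
        # Every write goes to all operational drives.
        live = [d for d in self.drives if d is not None]
        if not live:
            raise IOError("both drives are down")
        for drive in live:
            drive[block] = data

    def read(self, block):
        # Either operational drive can satisfy a read.
        for drive in self.drives:
            if drive is not None:
                return drive[block]
        raise IOError("both drives are down")

    def fail(self, index):
        # Simulate a single-drive failure.
        self.drives[index] = None

    def revive(self, index):
        # Copy all data from the surviving drive onto the replacement.
        source = self.drives[1 - index]
        if source is None:
            raise IOError("no surviving drive to copy from")
        self.drives[index] = dict(source)


vol = MirroredVolume()
vol.write(0, b"payroll")
vol.fail(0)                 # one drive fails; the volume stays usable
vol.write(1, b"ledger")
survivor_data = vol.read(0)
vol.revive(0)               # replacement drive receives a full copy
```

After the revive, both drives hold the same blocks again, including the block written while one drive was down.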
Multiple Copies of the Operating System
Each CPU (processing module) has its own copy of the operating system. If one
processing module fails, every other processing module continues to run on its own
copy of the operating system. Moreover, a failure in the operating system is confined to
the CPU in which the failure occurs and does not affect the other processing modules.
System Integrity
Concurrent with application program execution, the operating system continually
checks the integrity of the system. Each CPU transmits “I’m alive” messages to all
other CPUs at a predefined interval (approximately once per second). Following this
transmission, each CPU checks for the receipt of an “I’m alive” message from each of
the other CPUs.
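One interval of this exchange can be sketched as follows. This is a simplified simulation under stated assumptions, not the operating system's message-system code: the function heartbeat_round and its arguments are hypothetical, CPUs are plain integers, and message delivery is assumed to be instantaneous and lossless.

```python
def heartbeat_round(cpus, alive):
    """Simulate one interval of the "I'm alive" exchange.

    cpus  -- list of CPU numbers
    alive -- dict mapping CPU number -> True if that CPU is running
    Returns, for each running CPU, the sorted list of peers it heard nothing from.
    """
    inbox = {cpu: set() for cpu in cpus}

    # Step 1: every running CPU transmits to all other CPUs.
    for sender in cpus:
        if alive[sender]:
            for receiver in cpus:
                if receiver != sender:
                    inbox[receiver].add(sender)

    # Step 2: each running CPU checks for a message from each of the others.
    return {
        cpu: sorted(set(cpus) - {cpu} - inbox[cpu])
        for cpu in cpus
        if alive[cpu]
    }


silent = heartbeat_round([0, 1, 2], {0: True, 1: False, 2: True})
```

In this run, CPUs 0 and 2 each find that CPU 1 sent nothing during the interval.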
In addition to sending “I’m alive” messages to other CPUs, each CPU periodically tests
its ability to send and receive messages by sending messages to itself on both buses.
Unless it regularly receives messages from itself on at least one bus, it halts to ensure
that it will not interfere with the correct operation of other CPUs.
If the operating system in one processing module fails to receive “I’m alive” messages
from another processing module, it responds as follows. The operating system
groups the CPUs that are able to send messages to themselves and to others. CPUs
show that they are operational by joining the group; any CPUs that do not join the
group within a short period of time are declared nonoperational.
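The self-test and grouping rules above can be sketched as follows. This is a minimal model under assumptions, not the actual regrouping algorithm: the Cpu record, the join_delay field, and the deadline parameter are hypothetical, and the "short period of time" is represented by an arbitrary number of seconds.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cpu:
    number: int
    # Result of the self-addressed message test on each of the two buses.
    bus_ok: Tuple[bool, bool]
    # Seconds until this CPU answers the grouping request; None means never.
    join_delay: Optional[float]


def passes_self_test(cpu):
    # A CPU keeps running only if it hears itself on at least one bus.
    return any(cpu.bus_ok)


def regroup(cpus, deadline):
    """Split CPUs into (operational, nonoperational) lists of CPU numbers."""
    operational, nonoperational = [], []
    for cpu in cpus:
        joined = (
            passes_self_test(cpu)
            and cpu.join_delay is not None
            and cpu.join_delay <= deadline
        )
        (operational if joined else nonoperational).append(cpu.number)
    return operational, nonoperational


cpus = [
    Cpu(0, (True, True), 0.1),    # healthy
    Cpu(1, (False, True), 0.2),   # one bus down, still joins
    Cpu(2, (False, False), 0.1),  # fails self-test on both buses, so it halts
    Cpu(3, (True, True), None),   # never answers
]
up, down = regroup(cpus, deadline=1.0)
```

Here CPUs 0 and 1 join the group, while CPUs 2 and 3 are declared nonoperational.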
System Services
You can access the services supported by the operating system in two ways:
By making calls from an application program to Guardian procedures