COBOL Manual for TNS/E Programs (H06.08+, J06.03+)
Figure 44 illustrates the activity of a process pair. The backup process stays in monitor state while
the primary process is operating. If the primary fails, the backup leaves the monitor state and
begins executing at the point indicated by the last call to CHECKPOINT by the primary.
Figure 44 Activity of a Process Pair
This sequence of actions occurs when a process pair runs:
1. The primary process opens any files required for its execution.
2. The primary process starts its backup process in another processor module by executing a
STARTBACKUP verb.
This action also opens the files for the backup process and checkpoints the state of the primary
process to the backup process. A process pair opens files in a manner that permits both
members of the pair to have a file open while retaining the ability to exclude other processes
from accessing a file. When a disk file has been opened by a process pair in this manner, a
record or file lock by the primary process is also an equivalent lock by the backup process.
3. The backup process, at the beginning of its execution, automatically begins monitoring the
primary process. The backup process does nothing else unless the primary process
fails.
4. The primary process begins executing its main processing loop. At critical points in the
loop, typically before each write to a disk file, the primary process executes a
CHECKPOINT statement to copy part of its environment and pertinent file control information
to the backup process, which marks a restart point for the backup process. Typically, a program
contains several CHECKPOINT statements, each of which checkpoints only a portion of the
primary process’s environment.
5. If the primary process fails, the backup process begins executing at the restart point indicated
by the latest execution of a CHECKPOINT statement. The backup process is then considered
to be the primary process.
6. If the primary process failed because of a processor failure (that is, the backup process
received a processor-down message), the fault-tolerant facility in the new primary (former
backup) process automatically starts a new backup process when the failed processor has
been repaired and brought back online. This new backup process is then ready to take over
if the primary process fails.
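
The sequence above can be modeled in a few lines. The following Python sketch is a conceptual simulation only, not COBOL and not the fault-tolerant facility itself: the names Backup, PrimaryFailure, main_loop, run_pair, and fail_at are all hypothetical, the "disk file" is an in-memory list, and the simulated failure is assumed to occur after a checkpoint and before the write that follows it. It shows the backup holding only checkpointed state while it monitors, then resuming at the most recent restart point.

```python
import copy


class PrimaryFailure(Exception):
    """Hypothetical signal standing in for a failure of the primary process."""


class Backup:
    """Hypothetical backup member: monitors and stores checkpointed state."""

    def __init__(self):
        self.monitoring = True   # monitor state (step 3)
        self.state = {}          # accumulated checkpoint data

    def receive_checkpoint(self, partial_state):
        # Each checkpoint carries only part of the environment (step 4),
        # so merge it into what has already been received.
        self.state.update(copy.deepcopy(partial_state))

    def take_over(self):
        # Leave monitor state and become the primary (step 5).
        self.monitoring = False
        return self.state


def main_loop(records, disk, start, backup=None, fail_at=None):
    """Write records[start:] to disk, checkpointing before each write."""
    for i in range(start, len(records)):
        if backup is not None:
            backup.receive_checkpoint({"next_index": i})  # restart point
        if i == fail_at:
            raise PrimaryFailure(f"primary failed before writing record {i}")
        disk.append(records[i])


def run_pair(records, fail_at=None):
    """Run a simulated process pair; return the disk contents and the backup."""
    disk = []
    backup = Backup()
    # Starting the backup also checkpoints the primary's initial state (step 2).
    backup.receive_checkpoint({"next_index": 0})
    try:
        main_loop(records, disk, 0, backup, fail_at)
    except PrimaryFailure:
        state = backup.take_over()
        # The former backup resumes at the last restart point. No new backup
        # is started here; step 6 is outside the scope of this sketch.
        main_loop(records, disk, state["next_index"], backup=None)
    return disk, backup


if __name__ == "__main__":
    disk, backup = run_pair(["a", "b", "c", "d"], fail_at=2)
    print(disk)               # ['a', 'b', 'c', 'd']
    print(backup.monitoring)  # False
```

In this model every record is written exactly once because the failure falls between a checkpoint and its write. A failure between a write and the next checkpoint is the case that the checkpointed file control information addresses, and the sketch does not attempt to model it.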