COBOL Manual for TNS/E Programs (H06.08+, J06.03+)

Checkpointing

When the primary process executes a CHECKPOINT statement, one of its fault-tolerant facility routines formats the information to be checkpointed into an interprocess message and sends that message to the backup process. A fault-tolerant facility routine in the backup process receives the message and acts upon it.
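
The message flow described above can be pictured with a small conceptual model. This is a sketch in Python, not NonStop code: the class names, the method names, and the JSON message encoding are all assumptions made for illustration, and the real fault-tolerant facility routines are not shown in this section.

```python
import json


class BackupProcess:
    """Hypothetical stand-in for the backup process."""

    def __init__(self):
        self.state = {}

    def receive_checkpoint(self, message: bytes) -> None:
        # Decode the message and overwrite the matching items
        # in the backup's copy of the program state.
        self.state.update(json.loads(message.decode("utf-8")))


class PrimaryProcess:
    """Hypothetical stand-in for the primary process."""

    def __init__(self, backup: BackupProcess):
        self.backup = backup
        self.state = {}

    def checkpoint(self, *item_names: str) -> None:
        # Format only the named data items into a single message ...
        payload = {name: self.state[name] for name in item_names}
        message = json.dumps(payload).encode("utf-8")
        # ... and send it to the backup (a direct call models the send).
        self.backup.receive_checkpoint(message)


backup = BackupProcess()
primary = PrimaryProcess(backup)
primary.state = {"order_record": "ORDER 17", "line_count": 3, "scratch": 99}
primary.checkpoint("order_record", "line_count")
print(backup.state)
```

Only the items named in the checkpoint reach the backup; the `scratch` item stays local to the primary in this model.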

The two types of information you must usually checkpoint are data items and sync blocks.

Data Items

Data items are usually file record areas but can be any desired data items in the File Section, the Working-Storage Section, or the Extended-Storage Section of the Data Division. You must checkpoint any data items that are part of the program’s state: specifically, the disk record that is about to be written, the terminal or tape record that was just read, and any data that is necessary to resume processing at the site of the CHECKPOINT statement.

Checkpointing data items gives the backup process all the information it needs to reexecute an I-O request if the primary process fails. Usually, you checkpoint a data item just before writing the data to disk. You can also use data-item checkpointing to eliminate the need for the backup process to reexecute an I-O request. An entry received from a terminal is one example: you checkpoint the data item immediately after the READ statement that received it, which minimizes the possibility that the operator has to reenter data.
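
The terminal case can be illustrated with a conceptual model. The sketch below is Python, not COBOL, and every name in it (`Terminal`, `primary_run`, `backup_takeover`, the `terminal_record` key) is hypothetical. It only shows why checkpointing right after the read spares the operator from reentering data.

```python
from collections import deque


class Terminal:
    """Stands in for the operator: each read consumes one typed entry."""

    def __init__(self, entries):
        self._entries = deque(entries)
        self.reads = 0

    def read(self) -> str:
        self.reads += 1
        return self._entries.popleft()


def primary_run(terminal: Terminal, backup_state: dict) -> None:
    record = terminal.read()                    # models the READ statement
    backup_state["terminal_record"] = record    # checkpoint right after the read
    raise RuntimeError("primary process failed")  # failure after the checkpoint


def backup_takeover(terminal: Terminal, backup_state: dict) -> str:
    # If the entry was checkpointed, resume with it and skip the read.
    if "terminal_record" in backup_state:
        return backup_state["terminal_record"]
    # Otherwise the operator would have to reenter the data.
    return terminal.read()


terminal = Terminal(["ORDER 17"])
backup_state = {}
try:
    primary_run(terminal, backup_state)
except RuntimeError:
    resumed_record = backup_takeover(terminal, backup_state)
print(resumed_record, terminal.reads)
```

In this model the terminal is read once, and the backup resumes with the entry the primary already received.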

Sync Blocks

A sync block contains control information about the current state of a disk file (such as the current value of the file pointers).

The purpose of checkpointing the sync block is twofold:

• To ensure that a write operation is not duplicated when a backup process takes over from its primary process

• To pass the current file pointers’ values to the file system of the backup processor

When a process checkpoints a sync block, the operating environment passes the information in the sync block to the file system of the backup processor. Figure 45 illustrates why duplicate operations must be prevented. In Figure 45, a primary process successfully completes a sequential write operation (that is, an append to the end of the file) but fails before the subsequent checkpoint to its backup process. On takeover, the backup process reexecutes the operation that the primary process just completed. If the file system performs the write as requested, the record is duplicated at the new end-of-file location.

Figure 45 Duplication in Takeover
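
The takeover scenario can also be traced with a small conceptual model. This is a Python sketch under stated assumptions, not NonStop code: a list stands in for the disk file, a dictionary stands in for the checkpointed state, and the function names are hypothetical.

```python
class PrimaryFailure(Exception):
    """Models the primary process failing."""


def primary_run(disk_file: list, backup_state: dict, record: str) -> None:
    backup_state["record"] = record   # data item checkpointed before the write
    disk_file.append(record)          # sequential write completes successfully
    # The primary fails before its next checkpoint, so the backup
    # never learns that the write was already done.
    raise PrimaryFailure("failed before the next checkpoint")


def backup_takeover(disk_file: list, backup_state: dict) -> None:
    # The backup resumes from the last checkpoint and reexecutes the write,
    # which lands at the new end of file.
    disk_file.append(backup_state["record"])


disk_file = ["R1", "R2"]
backup_state = {}
try:
    primary_run(disk_file, backup_state, "R3")
except PrimaryFailure:
    backup_takeover(disk_file, backup_state)
print(disk_file)
```

With nothing to record that the first write completed, the model ends with record `R3` in the file twice.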

To prevent such duplicate write operations by the backup process, you must specify a nonzero SYNCDEPTH parameter in the OPEN statement. This action allows the file system to record the completion status of each input-output operation. If the backup process requests an operation already completed by the primary process, the file system recognizes this condition. Then, instead
