FORTRAN Reference Manual

Fault-Tolerant Programming
FORTRAN Reference Manual 528615-001
Where you place CHECKPOINT statements in your program depends on the requirements of
your application. As a general rule, include a CHECKPOINT statement just before
a WRITE statement to a disk file or to $RECEIVE; a server process might execute a
CHECKPOINT statement after reading from $RECEIVE. It is essential to checkpoint before
any nonretryable I/O operation, but which operations can be repeated and which
cannot depends on the application’s task and the program logic.
A retryable operation is one that can be repeated any number of times and yield the
same result each time; read operations generally fall into this category. A nonretryable
operation is one that cannot be trusted to yield the same result if it is repeated; write,
rewrite, and delete operations generally fall into this category.
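The distinction can be illustrated with a small simulation. The sketch below is written in Python rather than FORTRAN, and it models a disk file as an in-memory list; the names read_record and append_record are hypothetical and chosen only for illustration. They are not file-system calls.

```python
# Hypothetical in-memory stand-in for a disk file; not a real file-system API.
records = ["A", "B"]

def read_record(index):
    """Retryable: repeating the read leaves the file unchanged."""
    return records[index]

def append_record(value):
    """Nonretryable: each repetition adds another record to the file."""
    records.append(value)
    return len(records)

# Repeating the read yields the same result each time.
first_read = read_record(0)
second_read = read_record(0)

# Repeating the write yields a different result and leaves a duplicate record.
first_write = append_record("C")
second_write = append_record("C")
```

In this model, a backup that blindly repeats the read is harmless, whereas a backup that blindly repeats the write corrupts the file with a duplicate. That is why a checkpoint belongs before the write.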
Checkpointing File Buffers
The primary purpose of checkpointing file data buffers is to give the backup process all
the information it needs to reexecute an I/O request if the primary fails. Usually, data
buffer checkpointing occurs just before the data is written.
You can also use data buffer checkpointing to eliminate the need for the backup
process to reexecute an I/O request. Terminal input is an example of this: The data is
checkpointed on receipt to reduce the chance of the operator’s having to reenter it.
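The following Python sketch simulates the first use of data buffer checkpointing described above: the primary sends its buffer to the backup just before writing, so the backup can perform the write if the primary fails first. It is an illustration under simplifying assumptions, not FORTRAN and not the actual checkpoint mechanism; the Backup class, primary_run, and the fail_before_write flag are all hypothetical.

```python
class Backup:
    """Hypothetical backup process that holds the last checkpointed buffer."""

    def __init__(self):
        self.saved_buffer = None

    def receive_checkpoint(self, buffer):
        # Copy the buffer so later changes in the primary do not affect it.
        self.saved_buffer = list(buffer)

    def take_over(self, disk):
        # Reexecute the write that the primary was about to perform.
        if self.saved_buffer is not None:
            disk.append(tuple(self.saved_buffer))


def primary_run(backup, disk, buffer, fail_before_write):
    """Checkpoint the buffer, then write it unless a failure is simulated."""
    backup.receive_checkpoint(buffer)  # checkpoint just before the write
    if fail_before_write:
        return False  # simulated primary failure; the write never happened
    disk.append(tuple(buffer))
    return True


disk = []  # in-memory stand-in for a disk file
backup = Backup()
completed = primary_run(backup, disk, ["order", 42], fail_before_write=True)
if not completed:
    backup.take_over(disk)
```

The sketch covers only a failure that occurs before the write. If the primary fails after the write completes, reexecuting the request would duplicate it unless the receiving process can recognize the repetition, which is the purpose of checkpointing file status information.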
Checkpointing File Status Information
When a CHECKPOINT statement specifies a unit or file number, the system passes
the current status of the file to the file system that is running in the backup process’s
processor. If you specify a SYNCDEPTH greater than zero when you open the file, the
file’s status includes a system-assigned unique identification number for each I/O
operation you execute on the file. If your primary process fails, the backup process
begins executing instructions at the previous takeover point and might reexecute an I/O
operation that has already completed. In that case, the process that receives your I/O
request does not reexecute the request; instead, it returns the reply that it saved from
the original I/O request.
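The duplicate suppression described above can be sketched as follows. This is a Python simulation, not FORTRAN, and it makes several assumptions: the DiskServer class, its write method, and the sync_id parameter are hypothetical stand-ins for the receiving process and the system-assigned identification number. The sketch also keeps every saved reply and does not model any limit on how many are retained.

```python
class DiskServer:
    """Hypothetical receiving process that saves the reply for each request ID."""

    def __init__(self):
        self.records = []
        self.saved_replies = {}

    def write(self, sync_id, data):
        # A repeated ID means the request already completed:
        # return the saved reply and do not perform the write again.
        if sync_id in self.saved_replies:
            return self.saved_replies[sync_id]
        self.records.append(data)
        reply = ("ok", len(self.records))
        self.saved_replies[sync_id] = reply
        return reply


server = DiskServer()

# The primary completes the write and then fails.
original_reply = server.write(sync_id=7, data="debit 100")

# After takeover, the backup repeats the same request with the same ID.
repeated_reply = server.write(sync_id=7, data="debit 100")
```

Because both requests carry the same identification number, the file ends up with a single record and the backup receives the same reply the primary received.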
The following code is typical of a server process. The server reads a requester’s
message from $RECEIVE, executes one or more I/O operations to a file, and returns a
reply to the requester. The server might contain statements such as the following: