FORTRAN Reference Manual 528615-001

Fault-Tolerant Programming
Managing the Disk File
After a takeover, the new server primary might rewrite up to five records that the
former primary had already written to the disk. This is harmless: because the server
specified SYNCDEPTH=5 when it opened the disk file, the disk process saves its replies
to up to five write requests issued since the last stack checkpoint. If the disk process
receives a request to which it has already responded, it returns the saved reply and
does not perform the actual I/O a second time.
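The behavior described above can be illustrated with a small model. The following Python sketch is not FORTRAN and not the disk process's actual implementation. The class DiskProcess, the sync_id argument used to recognize a repeated request, and the reply format are all hypothetical names introduced for illustration. The sketch also omits the effect of a stack checkpoint on the saved replies.

```python
from collections import OrderedDict


class DiskProcess:
    """Toy model of a disk process that remembers recent write replies."""

    def __init__(self, sync_depth):
        self.sync_depth = sync_depth
        self.records = []                    # records actually written
        self.saved_replies = OrderedDict()   # sync_id -> saved reply (hypothetical key)

    def write(self, sync_id, record):
        # A request that was already answered gets the saved reply back;
        # the I/O is not performed a second time.
        if sync_id in self.saved_replies:
            return self.saved_replies[sync_id]
        self.records.append(record)
        reply = ("ok", len(self.records))
        self.saved_replies[sync_id] = reply
        # Keep at most sync_depth replies, discarding the oldest first.
        while len(self.saved_replies) > self.sync_depth:
            self.saved_replies.popitem(last=False)
        return reply


disk = DiskProcess(sync_depth=5)
first = [disk.write(i, f"employee-{i}") for i in range(1, 6)]
# After a takeover, the new primary repeats the same five writes.
replayed = [disk.write(i, f"employee-{i}") for i in range(1, 6)]
```

In this model the replayed writes receive the original replies, and the file still holds five records rather than ten.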
Supporting NonStop Requester Processes
If a fault-tolerant requester fails, its backup (now the new primary) can reissue the last
request. Because the server's function (adding employee records) is not retryable, the
server must be prepared to recognize a duplicate request and resend the reply, rather
than redoing the operation. It accomplishes this by saving replies to messages it
receives from the requester. The server must save at least the number of reply
messages that the requester specified as its SYNCDEPTH parameter when it opened
the server. If a FORTRAN run-time library routine receives a duplicate request, it
returns the same reply message that it returned the first time it replied to the message.
If a server is running as a NonStop process, it must checkpoint the saved replies to its
backup process as well.
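As a rough illustration of the requirement above, the following Python sketch models a server that saves its most recent replies and copies them to a backup. It is a conceptual model under stated assumptions, not the FORTRAN run-time library's mechanism. The class EmployeeServer, the request_id used to detect a duplicate, and the in-memory employee list are hypothetical.

```python
class EmployeeServer:
    """Toy server that adds employee records and saves its recent replies."""

    def __init__(self, requester_sync_depth, backup=None):
        self.depth = requester_sync_depth   # replies to keep (assumed >= 1)
        self.employees = []
        self.saved = []                     # (request_id, reply), oldest first
        self.backup = backup

    def handle(self, request_id, name):
        # Duplicate request: resend the saved reply, do not add the record again.
        for saved_id, saved_reply in self.saved:
            if saved_id == request_id:
                return saved_reply
        self.employees.append(name)
        reply = f"added {name} as record {len(self.employees)}"
        # Keep only the newest `depth` replies.
        self.saved = (self.saved + [(request_id, reply)])[-self.depth:]
        self._checkpoint()
        return reply

    def _checkpoint(self):
        # Copy the data and the saved replies to the backup, if there is one.
        if self.backup is not None:
            self.backup.employees = list(self.employees)
            self.backup.saved = list(self.saved)


backup = EmployeeServer(requester_sync_depth=1)
primary = EmployeeServer(requester_sync_depth=1, backup=backup)
reply = primary.handle(request_id=7, name="Ada")

# The requester's backup reissues request 7; suppose the server's backup
# has also taken over in the meantime and now receives the request.
retry = backup.handle(request_id=7, name="Ada")
```

Because the saved reply was copied to the backup, the reissued request is answered with the original reply and the employee record is not added a second time.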
Checkpointing $RECEIVE
Because $RECEIVE is a dynamic file, naming $RECEIVE in a CHECKPOINT
statement signifies nothing in itself; it does, however, signal the FORTRAN run-time
support system that the next write operation via $RECEIVE will be a reply to a
nonretryable request, which must be saved for use in case of primary requester failure.
Checkpointing Large Amounts of Data
The maximum size of a checkpoint message is 32,500 bytes. The amount of user data
checkpointed in a checkpoint message is less than 32,500 bytes because the message
includes header and control information added by the system. If your application needs
to checkpoint more data than can fit in one checkpoint message, you must checkpoint
the data by executing multiple CHECKPOINT statements.
If you execute more than one CHECKPOINT statement to checkpoint your data to the
backup process, you must not establish a takeover point (by specifying STACK='YES')
until you have sent all the data to the backup process. Otherwise, the data in the
backup process might be inconsistent when a takeover occurs: some of it might date
from a previous takeover point, and the rest might be data that you have just sent
to the backup process.
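The ordering rule above can be sketched as a planning step. The following Python model is not FORTRAN. It only shows how data might be divided across several checkpoint messages so that the takeover point is requested on the final message alone. The 32,500-byte limit comes from the text. ASSUMED_OVERHEAD is a hypothetical figure, because the size of the header and control information is not stated here, and plan_checkpoints is a name introduced for illustration.

```python
MAX_MESSAGE = 32500       # maximum checkpoint message size, from the text
ASSUMED_OVERHEAD = 500    # hypothetical; the actual header size is not given
CHUNK = MAX_MESSAGE - ASSUMED_OVERHEAD


def plan_checkpoints(data, chunk_size=CHUNK):
    """Split data into (chunk, takeover_point) pairs.

    Only the last pair has takeover_point set to True, so no takeover
    point is established until all the data has been sent.
    """
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    last = len(chunks) - 1
    return [(chunk, index == last) for index, chunk in enumerate(chunks)]


data = bytes(80000)
plan = plan_checkpoints(data)
flags = [takeover_point for _, takeover_point in plan]
```

With these assumed sizes, 80,000 bytes are sent in three messages, and only the third would correspond to a CHECKPOINT statement that specifies STACK='YES'.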