Availability Guide for Application Design
Availability Through Process-Pairs and Monitors
Availability Guide for Application Design—525637-004
Sending Process-State Information to the Passive Backup
• Critical data. This data is application dependent but might include data just read
  from a terminal, data about to be written to disk, or data maintained in processor
  memory in the primary.
• File synchronization information to make file system input or output retryable.
  Section 2, Overview of Server and Network Fault Tolerance, discusses how the
  disk process uses file synchronization blocks to make partial sector writes retryable
  in the event of primary disk process failure. To guard against primary application
  process failure, however, the primary process must also send synchronization
  information to its own backup.
This information can be checkpointed to the backup at, potentially, many points in the
program. Checkpointing, however, can consume a significant amount of system
resources; it is up to the application designer to determine what to checkpoint and
when.
Each checkpoint must be performed as a single operation. Splitting one logical
checkpoint across multiple checkpoint operations leaves the process pair vulnerable
to a failure between those operations, because the backup would then hold only part
of the new state.
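The risk of splitting a checkpoint can be illustrated with a small simulation. The sketch below is not Guardian code: PairState, Backup, checkpoint_once, checkpoint_split, and is_consistent are hypothetical names invented for this example, and the "backup" is just a structure in the same program. It only shows why a failure between two partial checkpoints leaves the backup with a mixture of old and new state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state that a primary keeps and its backup mirrors. */
typedef struct {
    long balance;       /* stand-in for critical application data */
    long file_sync_id;  /* stand-in for file synchronization information */
} PairState;

/* Hypothetical backup: holds the last state it received. */
typedef struct {
    PairState state;
} Backup;

/* Single-operation checkpoint: both fields arrive together, so the backup
   holds either the complete old state or the complete new state. */
void checkpoint_once(Backup *b, const PairState *s)
{
    assert(b != NULL && s != NULL);
    b->state = *s;
}

/* Split checkpoint: two separate transfers. A nonzero fail_between simulates
   the primary failing after the first transfer. Returns 1 if both transfers
   completed, 0 if the failure interrupted them. */
int checkpoint_split(Backup *b, const PairState *s, int fail_between)
{
    b->state.balance = s->balance;
    if (fail_between)
        return 0;
    b->state.file_sync_id = s->file_sync_id;
    return 1;
}

static int same_state(const PairState *a, const PairState *b)
{
    return a->balance == b->balance && a->file_sync_id == b->file_sync_id;
}

/* The backup is consistent if it matches one complete state, old or new. */
int is_consistent(const Backup *b, const PairState *old_s, const PairState *new_s)
{
    return same_state(&b->state, old_s) || same_state(&b->state, new_s);
}
```

In this model, calling checkpoint_split with fail_between set leaves the backup holding the new balance alongside the old synchronization value, which matches neither state that the primary actually passed through.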
The following operations checkpoint control-state or data-state information:
• File-open checkpoints
• Restart checkpoints
• Nonrestart checkpoints
File-Open Checkpoints
The FILE_OPEN_CHKPT_ procedure simply sends the status of files open in the
primary process to the backup process. The primary process typically does this shortly
after starting and opening the backup process and after opening files.
File-open checkpoints work as follows. When you open a file in the primary process,
you get an access control block for it that contains status information for the file. By
checkpointing the open information, you send this status information to the backup
process so that it can open the same file with identical status.
Such operations are necessary whenever the backup process is created: after first
starting up the process, after takeover, and after restarting a failed backup.
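The idea of a file-open checkpoint can be sketched as a simulation in C. This is an illustration only: it does not call FILE_OPEN_CHKPT_, and OpenStatus, ProcessFiles, sim_open, and sim_open_checkpoint are hypothetical names. The fields shown are assumptions chosen for the example, not the actual layout of an access control block.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_OPEN 8

/* Hypothetical stand-in for the status held in an access control block. */
typedef struct {
    int  in_use;
    char name[32];
    int  access_mode;
    long position;
} OpenStatus;

/* Hypothetical per-process table of open files, indexed by file number.
   Initialize it to all zeros before use. */
typedef struct {
    OpenStatus files[MAX_OPEN];
} ProcessFiles;

/* Simulated open in the primary. Returns a file number, or -1 if full. */
int sim_open(ProcessFiles *p, const char *name, int access_mode)
{
    assert(p != NULL && name != NULL);
    for (int i = 0; i < MAX_OPEN; i++) {
        if (!p->files[i].in_use) {
            OpenStatus *f = &p->files[i];
            f->in_use = 1;
            strncpy(f->name, name, sizeof f->name - 1);
            f->name[sizeof f->name - 1] = '\0';
            f->access_mode = access_mode;
            f->position = 0;
            return i;
        }
    }
    return -1;
}

/* Simulated file-open checkpoint: copy the status of one open file to the
   backup under the same file number, so that the backup holds the file with
   identical status. Returns 0 on success, -1 if the file is not open. */
int sim_open_checkpoint(const ProcessFiles *primary, ProcessFiles *backup,
                        int filenum)
{
    if (filenum < 0 || filenum >= MAX_OPEN || !primary->files[filenum].in_use)
        return -1;
    backup->files[filenum] = primary->files[filenum];
    return 0;
}
```

A primary modeled this way would call sim_open_checkpoint once for each open file every time a new backup is created, which corresponds to the three cases listed above.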
Restart Checkpoints
A restart checkpoint copies information from the primary process’s data stack along
with file synchronization information. By default, the entire data stack is
checkpointed. You can, however, specify how far down the stack you
want to checkpoint. Because the stack marker for the checkpointing procedure is also
checkpointed, this operation specifies the restart point for the backup process.
The CHECKPOINT, CHECKPOINTMANY, CHECKPOINTX, and
CHECKPOINTMANYX system procedures allow you to send this information to the
backup process.
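The following C sketch models a restart checkpoint as a simulation. It does not use the CHECKPOINT procedures or their parameters: DataStack, RestartImage, sim_restart_checkpoint, and sim_takeover are hypothetical names, the stack is a small integer array, and an integer restart_point stands in for the checkpointed stack marker. The base argument models the option to checkpoint only part of the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define STACK_WORDS 16

/* Hypothetical data stack of the primary: words[0..top) are in use. */
typedef struct {
    int words[STACK_WORDS];
    int top;
} DataStack;

/* Hypothetical image held by the backup. Initialize it to all zeros. */
typedef struct {
    int  words[STACK_WORDS];  /* mirror of the primary's stack */
    int  top;
    int  restart_point;       /* stand-in for the checkpointed stack marker */
    long file_sync_id;        /* stand-in for file synchronization info */
    int  valid;               /* nonzero once a checkpoint has arrived */
} RestartImage;

/* Simulated restart checkpoint: copy words[base..top) plus the restart point
   and synchronization value in one step. base == 0 copies the entire stack;
   a larger base copies only the upper part. Returns 0, or -1 on bad input. */
int sim_restart_checkpoint(const DataStack *s, int base, int restart_point,
                           long file_sync_id, RestartImage *img)
{
    assert(s != NULL && img != NULL);
    if (s->top < 0 || s->top > STACK_WORDS || base < 0 || base > s->top)
        return -1;
    memcpy(&img->words[base], &s->words[base],
           (size_t)(s->top - base) * sizeof(int));
    img->top = s->top;
    img->restart_point = restart_point;
    img->file_sync_id = file_sync_id;
    img->valid = 1;
    return 0;
}

/* Simulated takeover: rebuild the stack from the image and return the
   restart point at which to resume, or -1 if nothing was checkpointed. */
int sim_takeover(const RestartImage *img, DataStack *s)
{
    if (!img->valid)
        return -1;
    memcpy(s->words, img->words, (size_t)img->top * sizeof(int));
    s->top = img->top;
    return img->restart_point;
}
```

Note that in this model a partial checkpoint does not transfer words below base, so any change the primary made there since the previous checkpoint is not visible to the backup after takeover. That is the trade-off a designer accepts in exchange for a smaller checkpoint message.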