Guardian Programmer's Guide

Fault-Tolerant Programming in C
Guardian Programmer’s Guide 421922-014
Backup Process Organization
A process executing as the backup process proceeds as follows:
Message-Processing Loop
At the beginning of its execution, the backup process does the following:
1. Opens $RECEIVE so that it can receive messages from:
   - The operating system, indicating that the primary process or primary CPU
     has failed.
   - The primary process, containing current state information.
2. Calls MONITORCPUS to inform the system that the backup process is to be
   notified if the primary CPU fails.
   Note that if the primary process fails (rather than the CPU), the backup
   process is automatically notified; the backup does not need to request such
   notification.
3. Enters a message-processing loop in which it reads messages from $RECEIVE
   and takes appropriate action depending on the type of message:
   - If the message contains open state information for files opened by the
     primary process, the backup process calls __ns_backup_open to perform a
     backup open of the files opened by the primary process. The backup open
     allows files to be open concurrently by the primary and backup processes.
     After calling __ns_backup_open, the backup process continues executing the
     message-processing loop.
     Note that the stderr, stdin, and stdout files are automatically backup
     opened and do not require an explicit open.
   - If the message contains current state information, the backup process
     takes appropriate action to update its memory with the state information.
     For file state information, the backup process calls
     __ns_fset_file_state. For control and application state, processing is
     application-dependent. The backup process then continues executing the
     message-processing loop.
   - If the message indicates that the primary process or CPU has failed, the
     backup process takes over execution. The new primary process then exits
     the message-processing loop and begins the initialization phase.
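The dispatch logic of the message-processing loop can be sketched in portable C as shown below. This is only an illustration under stated assumptions, not the Guardian implementation: the message categories (msg_kind), the message and backup_ctx structures, the in-memory receive_queue that stands in for $RECEIVE, and the functions read_receive and backup_loop are all hypothetical names invented for this sketch. The real __ns_backup_open and __ns_fset_file_state calls are indicated only in comments, because their parameters are not described in this section.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical message categories; real message formats are not shown here. */
typedef enum {
    MSG_OPEN_STATE,     /* primary sent open state information for its files */
    MSG_CURRENT_STATE,  /* primary sent current state information */
    MSG_PRIMARY_FAILED, /* system reported primary process or CPU failure */
    MSG_OTHER           /* any other, unexpected message */
} msg_kind;

typedef struct {
    msg_kind kind;
    int payload;        /* stand-in for the state data carried by the message */
} message;

/* Hypothetical bookkeeping so the effect of each branch can be observed. */
typedef struct {
    int opens_done;     /* backup opens performed */
    int app_state;      /* most recent state value received from the primary */
    int replies_sent;   /* replies sent to unexpected messages */
    int is_primary;     /* set to 1 once the backup has taken over */
} backup_ctx;

/* In-memory stand-in for $RECEIVE (assumption for this sketch only). */
typedef struct {
    const message *msgs;
    size_t count;
    size_t next;
} receive_queue;

/* Copies the next queued message into *out; returns 0 when none remain. */
int read_receive(receive_queue *q, message *out)
{
    assert(q != NULL && out != NULL);
    if (q->next >= q->count)
        return 0;
    *out = q->msgs[q->next++];
    return 1;
}

/* Reads messages until the primary is reported failed or the queue is empty. */
void backup_loop(receive_queue *q, backup_ctx *ctx)
{
    message m;

    assert(q != NULL && ctx != NULL);
    while (read_receive(q, &m)) {
        switch (m.kind) {
        case MSG_OPEN_STATE:
            /* A real backup would call __ns_backup_open here. */
            ctx->opens_done++;
            break;
        case MSG_CURRENT_STATE:
            /* A real backup would call __ns_fset_file_state for file state;
               control and application state handling is application-dependent. */
            ctx->app_state = m.payload;
            break;
        case MSG_PRIMARY_FAILED:
            /* Take over: leave the loop so initialization can begin. */
            ctx->is_primary = 1;
            return;
        default:
            /* Unexpected message type: send an appropriate reply. */
            ctx->replies_sent++;
            break;
        }
    }
}
```

In this sketch a caller fills a receive_queue with test messages, zero-initializes a backup_ctx, and calls backup_loop. When the function returns with is_primary set, the caller would proceed to the initialization phase; messages queued after the failure notification are left unread.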
   - If the message is of an unexpected type, that is, if the message is
     neither state information from the primary process nor an indication of
     process or CPU failure from the system, the backup process sends an
     appropriate reply. For example, such a situation occurs in the following
     scenario:
     1. A primary server process fails.
     2. The backup process takes over.