Guardian Programmer's Guide

Introduction to Guardian Programming
Guardian Programmer’s Guide 421922-014
Application-Level Fault Tolerance
There are several ways in which an application can be designed to withstand
operating-system failures. Three such methods are introduced below:
Transaction protection using the NonStop Transaction Manager/MP (TM/MP)
Process pairs
Persistent processes
Any combination of these techniques could be appropriate for providing fault tolerance,
depending on the needs of the application.
The Transaction Approach to Fault Tolerance
With TM/MP software, you achieve fault tolerance by grouping operations into
transactions. At the start or end of a transaction, your data is always in a consistent
state. If any kind of failure occurs during the transaction, then the transaction is
“backed out” by rolling back the data to the known consistent state at the start of the
transaction. The transaction can then be restarted using consistent data. See the
Introduction to NonStop Transaction Manager/MP (TM/MP) for details.
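The back-out behavior described above can be illustrated with a small, portable C sketch. It does not use the TM/MP interface. The Database and Transaction types and the tx_begin, tx_abort, and transfer functions are hypothetical names invented for this illustration, and the in-memory snapshot is a simplification that stands in for whatever recovery mechanism TM/MP actually uses.

```c
#include <assert.h>
#include <string.h>

#define NUM_ACCOUNTS 2

/* Hypothetical in-memory "database": a set of account balances. */
typedef struct {
    long balance[NUM_ACCOUNTS];
} Database;

/* A transaction remembers the consistent state that existed when it began. */
typedef struct {
    Database *db;
    Database  snapshot;
} Transaction;

/* Record the known consistent state at the start of the transaction. */
void tx_begin(Transaction *tx, Database *db)
{
    assert(tx != NULL && db != NULL);
    tx->db = db;
    memcpy(&tx->snapshot, db, sizeof *db);
}

/* Back out: roll the data back to the state saved by tx_begin(). */
void tx_abort(Transaction *tx)
{
    memcpy(tx->db, &tx->snapshot, sizeof *tx->db);
}

/*
 * Move funds between two accounts as one transaction.
 * fail_midway simulates a failure after the debit but before the credit.
 * Returns 0 if the transaction completed, -1 if it was backed out.
 */
int transfer(Database *db, int from, int to, long amount, int fail_midway)
{
    Transaction tx;

    tx_begin(&tx, db);
    db->balance[from] -= amount;      /* data is now inconsistent */
    if (fail_midway) {
        tx_abort(&tx);                /* restore the starting state */
        return -1;
    }
    db->balance[to] += amount;        /* data is consistent again */
    return 0;
}
```

In this sketch, a transfer that fails midway leaves both balances exactly as they were before the call, so the caller can retry the same transfer against consistent data.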
Process Pairs
You can make your application fault tolerant by running it as a process pair,
which you set up with function calls (in C programs) or Guardian procedure
calls. A primary process performs the application's work, while a secondary
(backup) process in another CPU remains ready to take over if the primary
fails. The primary process uses checkpoints to copy selected
parts of its environment to the backup. Using this checkpointed information, the
backup process is able to take over from the primary without interrupting service to the
user of the application.
The process-pair technique can be used to protect data that cannot be considered part
of a transaction and therefore cannot be protected by the transaction mechanism; for
example, information that remains in memory and does not get written to disk.
Writing fault-tolerant programs in the C language is described in Section 27,
Fault-Tolerant Programming in C.
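The checkpoint-and-takeover idea can be sketched in portable C as a single-process simulation. This is not the Guardian checkpointing interface. The AppState and ProcessPair types and the pair_init, checkpoint, primary_process, and takeover functions are hypothetical, and copying one structure to another stands in for sending state to a backup process in another CPU.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical application state that exists only in memory. */
typedef struct {
    int  next_request;    /* number of the next request to process */
    long running_total;   /* result accumulated so far */
} AppState;

/* Simulated process pair: each member holds its own copy of the state. */
typedef struct {
    AppState primary;
    AppState backup;
} ProcessPair;

void pair_init(ProcessPair *pp)
{
    assert(pp != NULL);
    memset(pp, 0, sizeof *pp);
}

/* Checkpoint: copy the selected state from the primary to the backup. */
void checkpoint(ProcessPair *pp)
{
    pp->backup = pp->primary;
}

/*
 * The primary processes one request. If do_checkpoint is zero, the
 * primary is assumed to fail before it can checkpoint the new state.
 */
void primary_process(ProcessPair *pp, long value, int do_checkpoint)
{
    pp->primary.running_total += value;
    pp->primary.next_request++;
    if (do_checkpoint)
        checkpoint(pp);
}

/*
 * Takeover: the primary has failed and its memory is lost. The backup
 * continues from the most recently checkpointed state.
 */
AppState *takeover(ProcessPair *pp)
{
    memset(&pp->primary, 0, sizeof pp->primary);
    return &pp->backup;
}
```

After a takeover, the backup resumes at the last checkpoint, so any work the primary did after that checkpoint must be redone. Where checkpoints are placed therefore determines how much work is repeated after a failure.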
Persistent Processes
Processes that supply services to other processes but maintain no data of their
own need only to keep running. For such processes, it might be
appropriate simply to ensure that the process gets restarted whenever it stops. A
monitor process that periodically checks the process status can restart such a process.
Processes monitored in this way are sometimes called “persistent processes.”
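A monitor of this kind can be sketched in C. The sketch below uses POSIX calls (fork, waitpid, nanosleep) rather than Guardian procedure calls, so it illustrates only the general pattern. The worker_fn type and the monitor and flaky_worker functions are hypothetical names, and treating exit status 0 as a deliberate shutdown that should not be restarted is an assumption made so that the example terminates.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical worker: receives the attempt number, returns an exit status. */
typedef int (*worker_fn)(int attempt);

/*
 * Run the worker in a child process and restart it each time it stops
 * with a nonzero status, allowing at most max_restarts restarts.
 * Returns the number of restarts that were needed, or -1 if the worker
 * could not be started or kept failing after max_restarts restarts.
 */
int monitor(worker_fn worker, int max_restarts)
{
    const struct timespec poll_interval = { 0, 10 * 1000 * 1000 }; /* 10 ms */

    assert(worker != NULL);

    for (int attempt = 0; attempt <= max_restarts; attempt++) {
        pid_t pid = fork();
        if (pid < 0)
            return -1;
        if (pid == 0)
            _exit(worker(attempt));   /* child: run the service */

        /* Parent: periodically check the status of the child. */
        int status = 0;
        pid_t done;
        while ((done = waitpid(pid, &status, WNOHANG)) == 0)
            nanosleep(&poll_interval, NULL);
        if (done < 0)
            return -1;

        /* Assumption: exit status 0 means a deliberate shutdown. */
        if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
            return attempt;
        /* Any other outcome: loop around and restart the worker. */
    }
    return -1;
}

/* Example worker that fails on its first two attempts, then succeeds. */
int flaky_worker(int attempt)
{
    return attempt < 2 ? 1 : 0;
}
```

Because the monitored process keeps no data of its own, the monitor has nothing to restore: starting a fresh copy is enough to resume service.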