Guardian Programmer's Guide

Writing a Requester Program
Guardian Programmer’s Guide 421922-014
File Sync Block Checkpoints (Example)
• Nowait-depth to either 0 or 1, depending on whether you choose waited or nowait I/O.
• Sync-depth to 1. Sync-depth can be set to 0, but the requester must then
handle path retries itself.
If the requester is a process pair, any I/O that is resent on a takeover must use
the same sync ID as the original request. Requester process pairs
might also use sync depth values greater than 1 to optimize checkpointing.
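
The sync ID rule for requester process pairs can be illustrated with a small model. This is a sketch only, written in Python for brevity. The class and method names (Requester, checkpoint, takeover, resend) are hypothetical and are not Guardian procedures. On a real system the file system maintains sync IDs, and the checkpointing facility carries them to the backup.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PendingRequest:
    sync_id: int      # identifies this request to the server
    payload: bytes


class Requester:
    """Hypothetical model of one member of a requester process pair."""

    def __init__(self, next_sync_id: int = 1):
        self.next_sync_id = next_sync_id
        self.pending: Optional[PendingRequest] = None

    def start_request(self, payload: bytes) -> PendingRequest:
        # Assign a sync ID once, when the request is first issued.
        self.pending = PendingRequest(self.next_sync_id, payload)
        self.next_sync_id += 1
        return self.pending

    def checkpoint(self):
        # State the primary passes to the backup before issuing the I/O.
        return (self.next_sync_id, self.pending)

    @classmethod
    def takeover(cls, state) -> "Requester":
        # The backup rebuilds its state from the last checkpoint.
        next_sync_id, pending = state
        backup = cls(next_sync_id)
        backup.pending = pending
        return backup

    def resend(self) -> Optional[PendingRequest]:
        # Resend the outstanding request unchanged, with its original sync ID.
        return self.pending
```

Because the backup resends with the original sync ID instead of assigning a new one, the server can recognize the message as a duplicate of a request it may already have processed.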
I/O Synchronization in Server
On the server side, there are no automatic recovery mechanisms for path failures.
Servers are responsible for keeping track of their requests, except in COBOL85
programs. COBOL85 has Guardian-specific extensions that allow it to handle
openers and I/O synchronization effectively.
The burden of tracking requests can be avoided by writing context-free servers and
using TMF transactions for retries.
Path failures normally cause requesters to resend pending requests. In servers, these
resends must be detected as duplicate requests. The following are typical path failure
scenarios that servers must handle:
• If the server is a process pair, it must handle duplicate requests whenever it
switches processing to its backup process.
• If a requester (opener) is a process pair, the server might receive a duplicate
request at any time, because the requester backup process took over.
• If any of the requesters are in other systems, messages might be resent because
of communication failures between systems.
The information needed to track openers and requests is found in the OPEN system
messages and in data returned by FILE_GETRECEIVEINFO_ calls. The data is
normally collected into an open control block for each opener.
The server needs to manage each opener separately and save responses for up to the
greater of the open’s nowait depth and sync depth values, in order for it to be fault
tolerant. Note that a process pair normally maintains two opens for any given file, one
from the primary process and one from the backup process. The primary process first
opens the file and then instructs the backup process to do the same through a call to
FILE_OPEN_CHKPT_. This is known as a paired or a logical open.
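
The following sketch illustrates one way a server might detect duplicates and save responses for each opener. It is written in Python as a language-neutral model and is not Guardian code. The OpenControlBlock class, its fields, and the handle method are hypothetical. In a real server the opener identity and sync ID would come from the OPEN system message and from FILE_GETRECEIVEINFO_.

```python
from collections import OrderedDict


class OpenControlBlock:
    """Hypothetical per-opener state kept by a server."""

    def __init__(self, opener: str, nowait_depth: int, sync_depth: int):
        self.opener = opener
        # Keep replies for up to the greater of the two depth values.
        self.capacity = max(nowait_depth, sync_depth)
        self.saved = OrderedDict()  # sync_id -> saved reply, oldest first

    def handle(self, sync_id: int, request, do_work):
        # A sync ID already seen means the requester resent the message:
        # return the saved reply without repeating the operation.
        if sync_id in self.saved:
            return self.saved[sync_id]
        reply = do_work(request)
        if self.capacity > 0:
            self.saved[sync_id] = reply
            # Discard the oldest replies beyond the required depth.
            while len(self.saved) > self.capacity:
                self.saved.popitem(last=False)
        return reply
```

A fault-tolerant server would also checkpoint the saved replies to its backup process before replying, so that the backup can answer a resend after a takeover. That step is omitted here.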
The server must call MONITORCPUS (and MONITORNET if requesters reside in other
systems) to detect failing CPUs. If a requester resides in a failing CPU, no close
message is received. Instead, when a “processor down” system message is received,
the server must check all open control blocks for requesters in that CPU and implicitly
close those opens.
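
The implicit close on a processor failure can be sketched as a scan of the server's open table. This is an illustrative Python model, not Guardian code. The Opener record and the close_opens_on_cpu function are hypothetical names, and the sketch assumes the server has already received and decoded the "processor down" system message.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Opener:
    process_name: str
    cpu: int          # CPU in which the opening process runs
    file_number: int  # opener's file number for this open


def close_opens_on_cpu(open_table: dict, failed_cpu: int) -> list:
    """Remove every open control block whose opener ran in failed_cpu.

    open_table maps Opener -> open control block. Returns the openers
    that were implicitly closed.
    """
    closed = [opener for opener in open_table if opener.cpu == failed_cpu]
    for opener in closed:
        del open_table[opener]
    return closed
```

With a paired open, the primary and backup requester processes run in different CPUs, so a single CPU failure removes only one of the two opens and the surviving process can continue.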
In order to properly manage an opener, the server needs an open control block
containing the following information:
The requester’s process name