TS/MP 2.5 Pathsend and Server Programming Manual

ACS Subsystem Limit Errors
In some cases, you can recover from an ACS Subsystem limit error by retrying the Pathsend
procedure call. Whether a retry will work depends on the design and operating environment of
your application, including the configuration of static and dynamic links. Static links between ACS
subsystem processes and a PATHMON process generally persist for some time, depending on the
application and the system workload. Dynamic links and server classes come and go more
frequently, again depending on the application and the system workload. The number of concurrent
server-class send operations also changes from moment to moment.
Therefore, it might be appropriate to retry a call to SERVERCLASS_SEND_,
SERVERCLASS_DIALOG_BEGIN_, or SERVERCLASS_DIALOG_SEND_ (after a short wait) if the
concurrent calls limit is exceeded. Conversely, it might not be appropriate to retry a call if the limit
for the maximum number of PATHMON processes is exceeded.
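The retry policy described above can be sketched in C as follows. This is a minimal illustration and not code from TS/MP: the send_status categories, the send_fn and wait_fn callbacks, and the fake_sender stand-in are all hypothetical names introduced for this example. The application is assumed to wrap its real Pathsend call in the callback and to map the error it receives onto one of the categories. The sketch retries only when the concurrent calls limit is reported and returns immediately for any other outcome, including the PATHMON process limit.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical outcome categories. These are NOT real Pathsend error
 * numbers; the application maps the codes it receives onto them. */
typedef enum {
    SEND_OK,
    LIMIT_CONCURRENT_CALLS,   /* concurrent calls limit exceeded: may retry  */
    LIMIT_PATHMON_PROCESSES,  /* PATHMON process limit exceeded: do not retry */
    SEND_OTHER_ERROR
} send_status;

/* Hypothetical callbacks supplied by the application. */
typedef send_status (*send_fn)(void *ctx);  /* wraps the real Pathsend call  */
typedef void (*wait_fn)(unsigned attempt);  /* short wait between attempts   */

/* Calls do_send up to max_attempts times. Only a concurrent-calls limit
 * error triggers another attempt; every other status is returned at once. */
send_status send_with_retry(send_fn do_send, wait_fn wait_briefly,
                            void *ctx, unsigned max_attempts)
{
    send_status status = SEND_OTHER_ERROR;
    assert(do_send != NULL && max_attempts > 0);

    for (unsigned attempt = 1; attempt <= max_attempts; attempt++) {
        status = do_send(ctx);
        if (status != LIMIT_CONCURRENT_CALLS)
            return status;
        if (attempt < max_attempts && wait_briefly != NULL)
            wait_briefly(attempt);          /* wait before the next attempt */
    }
    return status;                          /* still over the limit */
}

/* Stand-in sender, used only to exercise the loop without a NonStop
 * system: it reports the concurrent-calls limit a fixed number of times
 * and then returns final_status. */
typedef struct {
    unsigned    limit_errors_left;
    send_status final_status;
    unsigned    calls;
} fake_sender;

send_status fake_send(void *ctx)
{
    fake_sender *f = ctx;
    f->calls++;
    if (f->limit_errors_left > 0) {
        f->limit_errors_left--;
        return LIMIT_CONCURRENT_CALLS;
    }
    return f->final_status;
}
```

In a real requestor, the wait_fn callback would delay for a short, application-chosen interval, and max_attempts would be kept small so that a persistent limit condition is reported rather than masked.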
For more information about PATHMON configuration and performance, see the NonStop TS/MP
2.5 System Management Manual.
ACS Restart Errors
At times, the ROUT process on a processor might go down for external or internal reasons.
The ACS subsystem tries to restart failed core processes, including the ROUT process. Therefore,
the Pathsend requestor must be coded as explained below.
When the ROUT process is restarted and a Pathsend requestor performs a context-free waited send
operation, the communication channel with the earlier ROUT process is closed. A new
communication channel is established with the new ROUT process and the send operation is
completed. If the ROUT process does not restart, the waited send operation fails, and the send
operation must be retried.
If a Pathsend requestor successfully initiates a nowaited context-free send operation, and the ROUT
process is restarted in that processor before the nowaited send operation completes, the
corresponding AWAITIOX procedure (or similar call) returns an error. The file number returned by
the AWAITIOX procedure is the same as the scsend-op-num returned when the send operation
is initiated. A subsequent FILEINFO procedure (or similar call) indicates file-system error 201
(FEPATHDOWN). The Pathsend requestor must complete (using AWAITIOX or similar calls) or cancel
(using CANCEL or CANCELREQ) all outstanding nowaited sends before it can initiate any further
nowaited sends. While any nowaited send is outstanding, further nowaited sends fail.
Because the server processes might already have replied to some nowaited sends that the
requestor has not yet completed, the application should complete the outstanding nowaited sends
rather than cancel them. For a send that has already been replied to, the corresponding
AWAITIOX procedure succeeds and returns the reply from the server process. If the
CANCEL or CANCELREQ procedure is used instead, the operation is canceled and the reply is lost.
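The bookkeeping that this rule implies can be sketched in C as follows. This is a minimal sketch that assumes the application tracks its own outstanding nowaited sends. The nowait_tracker structure and the tracker_* functions are hypothetical names introduced for this example, and no Guardian or Pathsend procedures are called. The application would call tracker_complete after AWAITIOX and FILEINFO (or similar calls) and tracker_cancel after CANCEL or CANCELREQ. The only value taken from the text is file-system error 201.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define FS_FEPATHDOWN 201  /* file-system error 201, as given in the text */

/* Hypothetical bookkeeping kept by the requestor. */
typedef struct {
    int  outstanding;      /* nowaited sends initiated but not yet finished */
    bool must_drain;       /* set when a completion reports FEPATHDOWN      */
    int  replies_kept;     /* sends completed with a reply                  */
    int  sends_cancelled;  /* sends cancelled; any reply to them is lost    */
} nowait_tracker;

/* Returns true if a new nowaited send may be initiated now. After a
 * FEPATHDOWN completion, new sends are refused until nothing is outstanding. */
bool tracker_begin_send(nowait_tracker *t)
{
    assert(t != NULL);
    if (t->must_drain && t->outstanding > 0)
        return false;
    t->must_drain = false;
    t->outstanding++;
    return true;
}

/* Removes one outstanding send and clears the drain flag when none remain. */
static void tracker_finish_one(nowait_tracker *t)
{
    assert(t != NULL && t->outstanding > 0);
    t->outstanding--;
    if (t->outstanding == 0)
        t->must_drain = false;
}

/* Records a completion; fs_error is the file-system error for that send. */
void tracker_complete(nowait_tracker *t, int fs_error)
{
    tracker_finish_one(t);
    if (fs_error == FS_FEPATHDOWN) {
        if (t->outstanding > 0)
            t->must_drain = true;   /* the remaining sends must be drained */
    } else if (fs_error == 0) {
        t->replies_kept++;          /* the server's reply was delivered */
    }
}

/* Records a cancellation of one outstanding send. */
void tracker_cancel(nowait_tracker *t)
{
    tracker_finish_one(t);
    t->sends_cancelled++;
}
```

Completing each outstanding send, rather than cancelling it, is what allows replies_kept to grow in this sketch. A cancelled send only reduces the outstanding count.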
After the outstanding nowaited sends are completed or canceled, a new nowaited send can be
initiated. If the ROUT process has been restarted in the meantime, the communication channel with
the earlier ROUT process is closed and a new communication channel is established, and the
current and subsequent nowaited sends succeed. If the ROUT process has not been restarted in the
meantime, the communication channel with the earlier ROUT process is still closed. However, no
new communication channel is established, and the send operation fails with Pathsend error
947 (FESCLINKMONCONNECT) and file-system error 14 (FENOSUCHDEV).
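The error combinations named in this section can be mapped to requestor actions with a small classifier such as the following. This is a minimal sketch: the restart_action categories and the classify_restart_error function are hypothetical names introduced for this example, and only the three error numbers come from the text. The sketch also assumes that a value of 0 means "no error" and that error numbers are never negative.

```c
#include <assert.h>

/* Error numbers as given in the text. */
#define FS_FENOSUCHDEV         14   /* file-system error 14  */
#define FS_FEPATHDOWN         201   /* file-system error 201 */
#define PS_FESCLINKMONCONNECT 947   /* Pathsend error 947    */

/* Hypothetical action categories for the requestor. */
typedef enum {
    ACTION_NONE,                /* no error reported                          */
    ACTION_DRAIN_THEN_RESEND,   /* finish outstanding nowaited sends first    */
    ACTION_ROUT_NOT_RESTARTED,  /* no new channel could be established        */
    ACTION_UNCLASSIFIED         /* not covered by this section                */
} restart_action;

/* Maps a Pathsend error and a file-system error to an action.
 * Assumption: 0 means "no error" and error numbers are non-negative. */
restart_action classify_restart_error(int pathsend_error, int fs_error)
{
    assert(pathsend_error >= 0 && fs_error >= 0);

    if (pathsend_error == 0 && fs_error == 0)
        return ACTION_NONE;
    if (pathsend_error == PS_FESCLINKMONCONNECT && fs_error == FS_FENOSUCHDEV)
        return ACTION_ROUT_NOT_RESTARTED;
    if (fs_error == FS_FEPATHDOWN)
        return ACTION_DRAIN_THEN_RESEND;
    return ACTION_UNCLASSIFIED;
}
```

Errors that this section does not discuss fall into ACTION_UNCLASSIFIED so that the application handles them explicitly rather than treating them as restart conditions.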
The following examples describe ACS restart scenarios.
The application uses only waited context-free sends.
After the application's first send operation (send1), the ACS subsystem goes down in a
processor and is automatically restarted before the next send operation (send2). The send1
operation completes successfully. When the send2 operation is initiated, TS/MP
detects a problem with the existing communication channel to the ROUT process. It closes the
old faulty channel with the ROUT process and tries to reopen it. Because the ROUT process