Using NFS as a file system type with HP Serviceguard A.11.20 on HP-UX and Linux

Configuring the cluster parameter CONFIGURED_IO_TIMEOUT_EXTENSION
In a Serviceguard cluster in which NFS-imported file systems are used, an unlikely but possible scenario exists in which
data corruption could occur. The scenario is as follows:
1. A Serviceguard package using an NFS file system (“NFSPkg”) is running on cluster node client-1.
2. Node client-1 issues an NFS write request immediately before NFSPkg moves to another cluster node.
3. NFSPkg is started on the adoptive node client-2.
4. Adoptive node client-2 begins sending NFS write requests to the same file and offset as the write request
previously sent by client-1 just before the package was moved.
5. If the original NFS write request from client-1 arrives on the NFS server after the new write requests from
client-2, the server would overwrite the data sent from client-2, thus resulting in data corruption.
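The ordering problem in the steps above can be illustrated with a minimal sketch. It assumes a simplified server that applies writes to a single file offset strictly in arrival order; the function name, the arrival times, and the data values are hypothetical and serve only to illustrate the scenario.

```python
from typing import List, Optional, Tuple

# (arrival time at the NFS server in seconds, sending node, data written)
Write = Tuple[float, str, bytes]


def apply_in_arrival_order(writes: List[Write]) -> Optional[Tuple[str, bytes]]:
    """Apply writes to one file offset in arrival order; the last arrival wins."""
    stored = None
    for _, node, data in sorted(writes, key=lambda w: w[0]):
        stored = (node, data)
    return stored


# Hypothetical timeline: client-1 issued its write first, but it was delayed
# in the network and reaches the server after the write from client-2.
writes = [
    (5.0, "client-1", b"stale data"),
    (2.0, "client-2", b"current data"),
]

print(apply_in_arrival_order(writes))  # ('client-1', b'stale data')
```

In this sketch the delayed write from client-1 is the one that remains on the server, which is the corruption described in step 5.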
To prevent this, you must determine a maximum delay between when a write is issued from any Serviceguard node and
when it can arrive at the NFS server. The following typical scenarios and illustrations could give you some guidance.
The NFS write may pass through network switches before it reaches the NFS server. Each switch drops a packet after a
specific time has elapsed. The IEEE bridge specification, 802.1D, refers to this value as the Maximum Bridge Transit
Delay (MBTD).
Important: All switches and routers that are configured between the NFS server and Serviceguard nodes must support MBTD.
You can calculate the maximum lifetime of an NFS client’s write packet by adding the MBTD values of all the switches
and routers that are configured between the NFS server and the Serviceguard nodes.
You must set the Serviceguard cluster parameter CONFIGURED_IO_TIMEOUT_EXTENSION for any cluster in which
packages use NFS mounts. See the section on cluster configuration parameters in the Managing Serviceguard Manual
for more information about CONFIGURED_IO_TIMEOUT_EXTENSION.
To set the value of CONFIGURED_IO_TIMEOUT_EXTENSION, first determine the MBTD for each switch and router. The
value should be in the vendor's documentation. Set CONFIGURED_IO_TIMEOUT_EXTENSION to the sum of the values
for the switches and routers. If there is more than one possible path between the NFS server and the cluster nodes,
add up the values along each path separately and use the largest of the per-path sums.
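The calculation described above can be sketched as follows. This is an illustration only: the device names, the MBTD values, and the function name are hypothetical, and real MBTD values must come from the vendor documentation. Values are expressed in microseconds.

```python
from typing import Dict, List


def nfs_io_timeout_extension_us(paths: List[List[str]], mbtd_us: Dict[str, int]) -> int:
    """Return the largest per-path sum of MBTD values, in microseconds.

    paths   -- each entry lists the switches and routers on one possible path
               between the NFS server and the cluster nodes
    mbtd_us -- MBTD of each device, in microseconds
    """
    if not paths:
        raise ValueError("at least one path is required")
    return max(sum(mbtd_us[device] for device in path) for path in paths)


# Hypothetical devices and MBTD values (microseconds).
mbtd_us = {"switch-a": 1_000_000, "switch-b": 1_000_000, "router-c": 500_000}

# Two hypothetical paths between the NFS server and the cluster nodes.
paths = [
    ["switch-a", "switch-b"],  # sums to 2,000,000
    ["switch-a", "router-c"],  # sums to 1,500,000
]

print(nfs_io_timeout_extension_us(paths, mbtd_us))  # 2000000
```

With these assumed values, the first path has the larger sum, so 2000000 would be the value to use.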
The CONFIGURED_IO_TIMEOUT_EXTENSION value increases with the number of routers and switches between the NFS server
and the Serviceguard nodes. Because the cluster reformation time increases by the value of
CONFIGURED_IO_TIMEOUT_EXTENSION, keep this value as low as possible by routing appropriately between the NFS
server and the Serviceguard nodes, or by using hardware that supports smaller MBTD values.
The CONFIGURED_IO_TIMEOUT_EXTENSION parameter must also be set in some cases in an extended-distance cluster
(EDC). See the discussion of this parameter in the Managing Serviceguard Manual for details. If packages use NFS
imports in an EDC, calculate the settings for each case separately (that is, the value required for the EDC configuration,
and the value required for NFS) and use the greater of the two values.
For example, if the EDC configuration requires CONFIGURED_IO_TIMEOUT_EXTENSION to be 1000000 (microseconds)
and the NFS configuration requires it to be 2000000, set it to 2000000.
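The example above can be expressed as a short sketch. The function name is hypothetical, and the printed "parameter value" line layout is an assumption about the cluster configuration file; see the Managing Serviceguard Manual for the authoritative syntax.

```python
def combined_io_timeout_extension_us(edc_us: int, nfs_us: int) -> int:
    """Return the greater of the EDC and NFS requirements, in microseconds."""
    if edc_us < 0 or nfs_us < 0:
        raise ValueError("timeout values must not be negative")
    return max(edc_us, nfs_us)


# Values from the example: EDC requires 1000000, NFS requires 2000000.
value = combined_io_timeout_extension_us(1_000_000, 2_000_000)

# Assumed line layout for the cluster configuration file.
print(f"CONFIGURED_IO_TIMEOUT_EXTENSION {value}")
```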