Serviceguard NFS Toolkit A.11.11.06, A.11.23.05 and A.11.31.03 Administrator's Guide

NOTE: If you enable the File Lock Migration feature, an NFS client (or group of clients)
may hit a corner case of requesting a file lock on the HA/NFS server and not receiving a
crash recovery notification message when the HA/NFS package migrates to an adoptive
node. This occurs only when the NFS client sends its initial lock request to the HA/NFS
server and then the HA/NFS package moves to an adoptive node before the FLM script
copies the /var/statmon/sm entry for this client to the package holding directory.
The probability of hitting this corner case is low, because the SM file copy
interval is very short (by default, five seconds). It is extremely unlikely that an NFS client
(or group of NFS clients) sends its initial lock request (it must be the initial request, since
this request generates the /var/statmon/sm file) to the HA/NFS server and that the package
then migrates within the same five-second window.
If you repeatedly experience a problem with this corner-case scenario, reduce the copy time
interval by setting the PROPAGATE_INTERVAL parameter to a lower value.
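The following sketch illustrates why a shorter PROPAGATE_INTERVAL narrows the exposure window. It is not the shipped nfs.flm script: the function names, the HOLDING_DIR placeholder, and the copy logic are assumptions made for illustration. Only PROPAGATE_INTERVAL, its five-second default, and the /var/statmon/sm path come from this guide.

```shell
#!/bin/sh
# Illustrative sketch only: NOT the shipped nfs.flm script.
# PROPAGATE_INTERVAL and /var/statmon/sm are named in the guide;
# HOLDING_DIR and both function names are hypothetical.
PROPAGATE_INTERVAL=${PROPAGATE_INTERVAL:-5}
SM_DIR=${SM_DIR:-/var/statmon/sm}
HOLDING_DIR=${HOLDING_DIR:-/tmp/holding_dir_example}

# Copy every client entry that is not yet present in the holding directory.
propagate_once() {
    for entry in "$SM_DIR"/*; do
        [ -f "$entry" ] || continue          # skip an unmatched glob or non-files
        name=${entry##*/}                    # strip the directory part
        [ -f "$HOLDING_DIR/$name" ] || cp "$entry" "$HOLDING_DIR/$name"
    done
}

# An entry created just after one pass is not copied until the next pass,
# so the exposure window is at most PROPAGATE_INTERVAL seconds.
propagate_loop() {
    while :; do
        propagate_once
        sleep "$PROPAGATE_INTERVAL"
    done
}
```

The block only defines the functions; nothing runs until propagate_loop is called. In this model, lowering PROPAGATE_INTERVAL from 5 to 2 shortens the longest time a new client entry can exist without a copy from five seconds to two.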
Editing the NFS Monitor Script (nfs.mon)
The NFS monitor script, nfs.mon, contains NFS-specific monitor variables and functions. The
nfs.mon script is an optional component of HA/NFS. The hanfs.sh file specifies whether the
NFS monitor script is used. The following steps describe how to configure the NFS monitor
script:
1. To monitor the File Lock Migration script (nfs.flm), set the NFS_FILE_LOCK_MIGRATION
variable to 1, and set the NFS_FLM_SCRIPT name to match the hanfs.sh script value for
this variable:
NFS_FILE_LOCK_MIGRATION=1
NFS_FLM_SCRIPT="${0%/*}/nfs1.flm"
NOTE: The file name of the NFS_FLM_SCRIPT script must be limited to 13 characters or
fewer.
NOTE: The nfs.mon script uses rpcinfo calls to check the status of various processes. If the
rpcbind process is not running, each rpcinfo call times out after 75 seconds. Because 10 rpcinfo
calls are attempted before failover, it takes approximately 12 minutes (10 × 75 seconds = 750
seconds) to detect the failure.
This problem has been fixed in release versions 11.11.04 and 11.23.03.
2. You can call the nfs.mon script with the following optional arguments:
Interval - the time (in seconds) between checks that the NFS processes are up and running. The default is 10 seconds.
Lockd Retry - the number of attempts to ping rpc.lockd before exiting. The default is 4 attempts.
Retry - the number of attempts to ping the rpc.statd, rpc.mountd, nfsd, rpc.pcnfsd, and nfs.flm processes before exiting. The default is 4 attempts.
Portmap Retry - the number of attempts to ping the rpcbind process before exiting. The default is 4 attempts.
These arguments are passed using the NFS_SERVICE_CMD line in the hanfs.sh file. To set
any one of these optional arguments, you must also specify all of the arguments that
precede it on the NFS_SERVICE_CMD line.
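The rule about preceding arguments can be pictured with a small sketch. It is not the shipped nfs.mon script: the parse_mon_args function and the variable names are hypothetical, and the sketch assumes the optional arguments are positional and arrive in the order listed above. The default values are the ones stated in this guide.

```shell
#!/bin/sh
# Illustrative sketch only: NOT the shipped nfs.mon script.
# Assumes positional arguments in the order the guide lists them;
# parse_mon_args and the variable names are hypothetical.
parse_mon_args() {
    INTERVAL=${1:-10}        # seconds between checks (default 10)
    LOCKD_RETRY=${2:-4}      # attempts to ping rpc.lockd (default 4)
    RETRY=${3:-4}            # attempts for the other NFS processes (default 4)
    PORTMAP_RETRY=${4:-4}    # attempts to ping rpcbind (default 4)
}

# To change only Retry (the third argument), Interval and Lockd Retry
# must be supplied as well, here with their default values.
parse_mon_args 10 4 6
echo "interval=$INTERVAL lockd_retry=$LOCKD_RETRY retry=$RETRY portmap_retry=$PORTMAP_RETRY"
```

Under these assumptions, passing a single value of 6 would change Interval rather than Retry, which is why the earlier arguments have to be written out even when they keep their defaults.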