Serviceguard NFS Toolkit A.11.11.06, A.11.23.05 and A.11.31.03 Administrator's Guide

NOTE: If you enable the File Lock Migration feature, an NFS client (or group of clients)
can, in rare cases, request a file lock on the HA/NFS server and then not receive a crash
recovery notification message when the HA/NFS package migrates to an adoptive node. This
occurs only when the NFS client sends its initial lock request to the HA/NFS server and the
HA/NFS package then moves to an adoptive node before the FLM script copies the
/var/statmon/sm entry for this client to the package holding directory.

This corner case is unlikely, because the SM file copy interval is very short (five seconds
by default). For the problem to occur, an NFS client (or group of NFS clients) must send its
initial lock request to the HA/NFS server and the package must migrate within the same
five-second window. Only the initial request qualifies, because it is the request that
generates the /var/statmon/sm file.

If you repeatedly encounter this corner case, shorten the copy interval by setting the
PROPAGATE_INTERVAL parameter to a lower value.
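
As a minimal sketch, a shorter copy interval might look like the line below. It assumes that PROPAGATE_INTERVAL is set as an ordinary shell variable assignment holding a number of seconds; the value 2 is a hypothetical example, not a recommendation.

```shell
# Assumed form: a plain shell variable assignment, value in seconds.
# The default is 5; 2 is a hypothetical lower value for illustration.
PROPAGATE_INTERVAL=2
```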

Editing the NFS Monitor Script (nfs.mon)

The NFS monitor script, nfs.mon, contains NFS-specific monitor variables and functions. It is
an optional component of HA/NFS; the hanfs.sh file specifies whether the NFS monitor script
is used. To configure the NFS monitor script, follow these steps:

1. To monitor the File Lock Migration script (nfs.flm), set the NFS_FILE_LOCK_MIGRATION
variable to 1, and set the NFS_FLM_SCRIPT variable to the same value that it has in the
hanfs.sh script:

NFS_FILE_LOCK_MIGRATION=1
NFS_FLM_SCRIPT="${0%/*}/nfs1.flm"

NOTE: The file name of the NFS_FLM_SCRIPT script must be 13 characters or fewer.
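
The length limit in the preceding note can be checked before the package is started. The following is a sketch only: the function name flm_name_ok and the directory /etc/cmcluster/nfs are hypothetical and are not part of the toolkit.

```shell
#!/bin/sh
# Sketch: verify that the file name portion of an FLM script path is at
# most 13 characters long. The path below is a hypothetical example.
NFS_FLM_SCRIPT="/etc/cmcluster/nfs/nfs1.flm"

# Return 0 if the base name of $1 has 13 or fewer characters.
flm_name_ok() {
    name=${1##*/}            # strip the directory part of the path
    [ "${#name}" -le 13 ]
}

if flm_name_ok "$NFS_FLM_SCRIPT"; then
    echo "OK: ${NFS_FLM_SCRIPT##*/}"
else
    echo "Too long: ${NFS_FLM_SCRIPT##*/}" >&2
fi
```

Only the file name is measured; the directory portion of the path does not count toward the limit in this sketch.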

NOTE: The nfs.mon script uses rpcinfo calls to check the status of various processes. If the
rpcbind process is not running, each rpcinfo call times out after 75 seconds. Because 10
rpcinfo calls are attempted before failover, it takes approximately 12 minutes (10 x 75
seconds = 750 seconds) to detect the failure. This problem has been fixed in versions
11.11.04 and 11.23.03.
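
The detection time in the preceding note follows directly from the two figures it gives. The arithmetic can be sketched as follows; the variable names are illustrative only, and the calculation ignores any pause between attempts.

```shell
# Worst-case detection time when rpcbind is down, using the figures
# from the note above. Variable names are illustrative, not toolkit names.
RPCINFO_TIMEOUT=75   # seconds before one rpcinfo call times out
RPCINFO_CALLS=10     # rpcinfo calls attempted before failover

total=$((RPCINFO_TIMEOUT * RPCINFO_CALLS))
echo "${total} seconds, about $((total / 60)) minutes $((total % 60)) seconds"
```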

2. You can call the nfs.mon script with the following optional arguments:

• Interval - the time, in seconds, between checks that the NFS processes are up and
running. The default is 10 seconds.
• Lockd Retry - the number of attempts to ping rpc.lockd before exiting. The default
is 4 attempts.
• Retry - the number of attempts to ping the rpc.statd, rpc.mountd, nfsd,
rpc.pcnfsd, and nfs.flm processes before exiting. The default is 4 attempts.
• Portmap Retry - the number of attempts to ping the rpcbind process before exiting.
The default is 4 attempts.

These arguments are passed on the NFS_SERVICE_CMD line in the hanfs.sh file. The arguments
are positional: to set any one of them, you must also specify all of the arguments that
precede it on the NFS_SERVICE_CMD line.
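
The rule about preceding arguments can be illustrated with a short sketch. It assumes that the arguments are given in the order listed above (Interval, Lockd Retry, Retry, Portmap Retry). The helper build_nfs_mon_cmd and the path /etc/cmcluster/nfs/nfs.mon are hypothetical, and the exact form of the NFS_SERVICE_CMD assignment in your hanfs.sh file may differ.

```shell
# Hypothetical helper: build an nfs.mon command line in which every
# argument preceding the one you want to change is written out explicitly.
# Assumes the positional order matches the list above:
#   interval, lockd retry, retry, portmap retry
build_nfs_mon_cmd() {
    script=$1
    interval=${2:-10}        # documented default: 10 seconds
    lockd_retry=${3:-4}      # documented default: 4 attempts
    retry=${4:-4}            # documented default: 4 attempts
    portmap_retry=${5:-4}    # documented default: 4 attempts
    echo "$script $interval $lockd_retry $retry $portmap_retry"
}

# Example: raise only Portmap Retry to 6. The three preceding arguments
# keep their default values but must still appear on the line.
NFS_SERVICE_CMD=$(build_nfs_mon_cmd "/etc/cmcluster/nfs/nfs.mon" "" "" "" 6)
echo "$NFS_SERVICE_CMD"
```

Writing the defaults out explicitly keeps the command line unambiguous, since the script cannot tell which argument a single trailing number is meant for.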
