Serviceguard NFS Toolkit A.11.11.06, A.11.23.05 and A.11.31.03 Administrator's Guide

rpc.statd, rpc.lockd, nfsd, rpc.mountd, rpc.pcnfsd, and nfs.flm processes. You can
monitor any or all of these processes as follows:
To monitor the rpc.statd, rpc.lockd, and nfsd processes, you must set the NFS_SERVER
variable to 1 in the /etc/rc.config.d/nfsconf file. If one nfsd process dies or is killed,
the package fails over, even if other nfsd processes are running.
To monitor the rpc.mountd process, you must set the START_MOUNTD variable to 1 in the
/etc/rc.config.d/nfsconf file. The rpc.mountd process must also be started when the system
boots up, not by inetd.
To monitor the rpc.pcnfsd process, you must set the PCNFS_SERVER variable to 1 in the
/etc/rc.config.d/nfsconf file.
To monitor the nfs.flm process, you must enable the File Lock Migration feature. Monitor
this process with the ps command, not with the rpcinfo command. If you enable the File
Lock Migration feature, ensure that the monitor script name is unique for each package (for
example, nfs1.mon).
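As a minimal sketch, the /etc/rc.config.d/nfsconf settings described above might look like the
following when all three rc-controlled groups of processes are to be monitored. Only these three
variables come from the text; everything else in the file is assumed to stay as it is.

```shell
# Excerpt of /etc/rc.config.d/nfsconf (other variables in the file are left unchanged)
NFS_SERVER=1      # needed to monitor rpc.statd, rpc.lockd, and nfsd
START_MOUNTD=1    # needed to monitor rpc.mountd (started at boot, not by inetd)
PCNFS_SERVER=1    # needed to monitor rpc.pcnfsd
```

Set only the variables for the processes you actually want to monitor.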
NOTE: The file name of the NFS_FLM_SCRIPT script must be limited to 13 characters or fewer.
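The 13-character limit can be checked before the package is configured. The following is only a
sketch: the path /etc/cmcluster/nfs/nfs1.flm and the variable names are hypothetical and should be
replaced with your own.

```shell
# Hypothetical File Lock Migration script path; substitute your own.
flm_script="/etc/cmcluster/nfs/nfs1.flm"

# Strip the directory part so that only the file name is measured.
flm_name="${flm_script##*/}"

# ${#var} expands to the length of the value in characters.
if [ "${#flm_name}" -le 13 ]; then
    flm_check="ok"
else
    flm_check="too long"
fi
echo "${flm_name}: ${#flm_name} characters (${flm_check})"
```

For the hypothetical name above, the file name nfs1.flm is 8 characters long and passes the check.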
NOTE: The nfs.mon script uses rpcinfo calls to check the status of various processes. If the
rpcbind process is not running, each rpcinfo call times out after 75 seconds. Because 10 rpcinfo
calls are attempted before failover, it takes approximately 12.5 minutes (10 calls of 75 seconds
each) to detect the failure. This problem has been fixed in release versions 11.11.04 and 11.23.03.
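The detection delay in the note above follows directly from the two figures it gives. A small
worked calculation, with variable names chosen here purely for illustration:

```shell
# Both values are taken from the note above.
rpcinfo_timeout=75    # seconds before one rpcinfo call times out
rpcinfo_attempts=10   # rpcinfo calls attempted before failover

total=$((rpcinfo_timeout * rpcinfo_attempts))
echo "detection delay: ${total} seconds ($((total / 60)) min $((total % 60)) s)"
```

This prints a delay of 750 seconds, that is, 12 minutes and 30 seconds.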
The default NFS control script, hanfs.sh, does not invoke the monitor script. You do not have
to run the NFS monitor script to use Serviceguard NFS. If the NFS package configuration file
specifies AUTO_RUN YES and LOCAL_LAN_FAILOVER YES (the defaults), the package switches
to the next adoptive node or to a standby network interface in the event of a node or network
failure. However, if one of the NFS services goes down while the node and network remain up,
you need the NFS monitor script to detect the problem and to switch the package to an adoptive
node.
Whenever the monitor script detects an event, it logs the event. Each NFS package has its own
log file, named after the package's NFS control script (nfs.cntl) with a .log extension appended.
For example, if your control script is called /etc/cmcluster/nfs/nfs1.cntl, the
log file is called /etc/cmcluster/nfs/nfs1.cntl.log.
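The naming rule can be expressed in one line of shell. This sketch reuses the example path from
the text; the variable names are assumptions made for illustration.

```shell
# Control script path from the example above.
cntl_script="/etc/cmcluster/nfs/nfs1.cntl"

# The package log file is the control script name with ".log" appended.
log_file="${cntl_script}.log"
echo "$log_file"
```

On a running system you could then follow new events as they are logged with tail -f "$log_file".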
TIP: You can specify the number of retry attempts for all these processes in the nfs.mon file.
On the Client Side
The client should NFS-mount a file system using the package name in the mount command. The
package name is associated with the package's relocatable IP address. On client systems, be sure
to use a hard mount and set the proper retry values for the mount. Alternatively, set an
appropriate timeout for the automounter. The timeout should be greater than the total end-to-end
recovery time for the Serviceguard NFS package, that is, the time needed to run fsck, mount the
file systems, and export them on the new node. (With journaled file systems, this time should be
between one and two minutes.) Setting the timeout to a value greater than the recovery time allows
clients to reconnect to the file system after it returns to the cluster on the new node.
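A hard mount that names the package might be assembled as sketched below. Everything here is an
assumption made for illustration: the package name nfs1, the export path, the mount point, and the
option values are placeholders, not recommendations. The -F nfs form is HP-UX mount syntax (other
clients differ, for example -t nfs on Linux), and hard, timeo, and retrans are common NFS mount
options whose exact meaning should be checked in your client's mount documentation. The sketch
only prints the command; it does not mount anything.

```shell
# All values below are hypothetical placeholders.
pkg_name="nfs1"               # package name that resolves to the relocatable IP address
export_path="/hanfs/export"   # file system exported by the package
mount_point="/mnt/hanfs"      # local mount point on the client
mount_opts="hard,timeo=11,retrans=5"   # hard mount plus illustrative retry values

# Build the command and print it for review instead of running it.
mount_cmd="mount -F nfs -o ${mount_opts} ${pkg_name}:${export_path} ${mount_point}"
echo "$mount_cmd"
```

Choose the retry values so that the client keeps retrying for longer than the package's recovery
time.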
NOTE: AutoFS mounts may fail when mounting file systems exported by an HA-NFS package
soon after that package has been restarted. To avoid these mount failures, AutoFS clients should
wait at least 60 seconds after an HA-NFS package has started before mounting file systems
exported from that package.