Serviceguard NFS Toolkit A.11.11.06, A.11.23.05 and A.11.31.03 Administrator's Guide

rpc.statd, rpc.lockd, nfsd, rpc.mountd, rpc.pcnfsd, and nfs.flm processes. You can

monitor any or all of these processes as follows:

• To monitor the rpc.statd, rpc.lockd, and nfsd processes, you must set the NFS_SERVER

variable to 1 in the /etc/rc.config.d/nfsconf file. If one nfsd process dies or is killed,

the package fails over, even if other nfsd processes are running.

• To monitor the rpc.mountd process, you must set the START_MOUNTD variable to 1 in the

/etc/rc.config.d/nfsconf file. The rpc.mountd process must also be started when the

system boots, not by inetd.

• To monitor the rpc.pcnfsd process, you must set the PCNFS_SERVER variable to 1 in the

/etc/rc.config.d/nfsconf file.

• To monitor the nfs.flm process, you must enable the File Lock Migration feature. Monitor

this process with the ps command, not with the rpcinfo command. If you enable the File

Lock Migration feature, ensure that the monitor script name is unique for each package (for

example, nfs1.mon).

NOTE: The file name of the NFS_FLM_SCRIPT script must be limited to 13 characters or fewer.
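
The nfsconf settings listed in the bullets above can be checked with a small script. The following is a minimal sketch, not part of the toolkit: the function name check_nfsconf_var is hypothetical, and it assumes each variable appears on its own line in plain VAR=1 form (no quotes, no trailing comment).

```shell
# Hypothetical helper (not shipped with the toolkit): report whether a
# variable is set to 1 in an nfsconf-style file.
# Assumption: the setting appears as a plain VAR=1 line.
check_nfsconf_var() {
    conf_file=$1
    var_name=$2
    # Leading blanks are allowed; commented-out lines do not match.
    if grep -E "^[[:space:]]*${var_name}=1[[:space:]]*\$" "$conf_file" >/dev/null 2>&1; then
        echo "$var_name is set to 1"
    else
        echo "$var_name is NOT set to 1"
        return 1
    fi
}
```

For example, running check_nfsconf_var /etc/rc.config.d/nfsconf START_MOUNTD reports whether rpc.mountd monitoring has its prerequisite setting. The same call can be repeated for NFS_SERVER and PCNFS_SERVER.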

NOTE: The nfs.mon script uses rpcinfo calls to check the status of various processes. If the

rpcbind process is not running, the rpcinfo calls time out after 75 seconds. Because 10 rpcinfo

calls are attempted before failover, it takes approximately 12 minutes to detect the failure. This

problem has been fixed in release versions 11.11.04 and 11.23.03.
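
The detection time quoted in this note follows directly from the two figures it gives. As a quick worked check (the variable names are illustrative only):

```shell
# Figures taken from the note above; variable names are illustrative.
rpcinfo_timeout=75    # seconds before one rpcinfo call times out
rpcinfo_attempts=10   # rpcinfo calls attempted before failover

detection_secs=$((rpcinfo_timeout * rpcinfo_attempts))
echo "Detection time: ${detection_secs} seconds (about $((detection_secs / 60)) minutes)"
```

This prints a detection time of 750 seconds, which shell integer division rounds down to 12 minutes.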

The default NFS control script, hanfs.sh, does not invoke the monitor script. You do not have

to run the NFS monitor script to use Serviceguard NFS. If the NFS package configuration file

specifies AUTO_RUN YES and LOCAL_LAN_FAILOVER YES (the defaults), the package switches

to the next adoptive node or to a standby network interface in the event of a node or network

failure. However, if one of the NFS services goes down while the node and network remain up,

you need the NFS monitor script to detect the problem and to switch the package to an adoptive

node.

Whenever the monitor script detects an event, it logs the event. Each NFS package has its own

log file. The log file name is the name of the NFS control script, nfs.cntl, followed by a .log

extension. For example, if your control script is called /etc/cmcluster/nfs/nfs1.cntl, the

log file is called /etc/cmcluster/nfs/nfs1.cntl.log.
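
The naming rule can be expressed in one line of shell. This is a minimal sketch; the function name log_file_for is hypothetical and is not part of the toolkit.

```shell
# Hypothetical helper: derive the monitor log file name from the path of
# a package's control script by appending .log, as described above.
log_file_for() {
    echo "$1.log"
}

log_file_for /etc/cmcluster/nfs/nfs1.cntl
```

With the control script from the example above, this prints /etc/cmcluster/nfs/nfs1.cntl.log.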

TIP: You can specify the number of retry attempts for all these processes in the nfs.mon file.

On the Client Side

The client should NFS-mount a file system using the package name in the mount command. The

package name is associated with the package's relocatable IP address. On client systems, be sure

to use a hard mount and set the proper retry values for the mount. Alternatively, set the proper

timeout for the automounter. The timeout should be greater than the total end-to-end recovery

time for the Serviceguard NFS package, that is, the time needed to run fsck, mount file systems,

and export file systems on the new node. (With journaled file systems, this time should be between one and

two minutes.) Setting the timeout to a value greater than the recovery time allows clients to

reconnect to the file system after it returns to the cluster on the new node.
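
As an illustration of the client-side mount described above, the sketch below prints a hard-mount command without running it. It assumes the HP-UX mount -F nfs syntax and the standard hard, timeo, and retrans NFS mount options; other client platforms use different syntax. The function name build_mount_cmd, the package name nfs1pkg, the paths, and the option values are all placeholders, not recommendations.

```shell
# Hypothetical dry-run helper: print (do not run) an NFS hard-mount command.
# All argument values used below are placeholders chosen for illustration.
build_mount_cmd() {
    pkg_name=$1      # package name; resolves to the relocatable IP address
    remote_dir=$2    # file system exported by the package
    mount_point=$3   # local mount point on the client
    timeo=$4         # NFS request timeout, in tenths of a second
    retrans=$5       # number of retransmissions per request
    echo "mount -F nfs -o hard,timeo=${timeo},retrans=${retrans} ${pkg_name}:${remote_dir} ${mount_point}"
}

build_mount_cmd nfs1pkg /export/home /mnt/home 600 5
```

Printing the command first makes it easy to review the options against the expected recovery time of the package before mounting anything.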

NOTE: AutoFS mounts may fail when mounting file systems exported by an HA-NFS package

soon after that package has been restarted. To avoid these mount failures, AutoFS clients should

wait at least 60 seconds after an HA-NFS package has started before mounting file systems

exported from that package.
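
The 60-second delay in this note can be sketched as a small calculation. The function name autofs_wait_remaining is hypothetical, and the sketch assumes the client already knows when the package started (both times in seconds since the epoch). How the client learns that start time is outside the scope of this example.

```shell
# Hypothetical helper: seconds an AutoFS client should still wait before
# mounting, given the package start time and the current time.
autofs_wait_remaining() {
    pkg_start=$1
    now=$2
    min_wait=60   # minimum delay, taken from the note above
    elapsed=$((now - pkg_start))
    if [ "$elapsed" -ge "$min_wait" ]; then
        echo 0
    else
        echo $((min_wait - elapsed))
    fi
}

autofs_wait_remaining 1000 1020   # 20 seconds elapsed, so prints 40
```

A client-side wrapper could sleep for the returned number of seconds before triggering the mount.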
