How the Control and Monitor Scripts Work
NOTE To configure NFS for maximum availability, you must do the following:
To have the NFS package start automatically when the cluster starts
up, and to start on an adoptive node after a failure, you need to
specify AUTO_RUN=YES in the package configuration file.
Also, the default NFS control script does not invoke the NFS
monitor script, nfs.mon. You must edit the control script to invoke
nfs.mon (see Chapter 3, “Sample Configurations”) if you want the
package to fail over when one of the package’s NFS services goes
down while the node and network remain up. An illustrative sketch of
both settings follows this note.
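The fragments below are a sketch of the two settings mentioned above, not a
template. The package name, service name, and the path to nfs.mon are
assumptions, and the exact parameter syntax (including whether an equals sign
or whitespace is used) depends on your Serviceguard and NFS toolkit versions;
consult the template files shipped with your release.

    # Package configuration file excerpt (names and syntax illustrative)
    PACKAGE_NAME        nfs-pkg1
    AUTO_RUN            YES     # start with the cluster; fail over to an adoptive node

    # Control script excerpt adding the NFS monitor as a package service
    # (the path to nfs.mon is a placeholder)
    SERVICE_NAME[0]="nfs-pkg1.monitor"
    SERVICE_CMD[0]="/opt/cmcluster/nfstoolkit/nfs.mon"
    SERVICE_RESTART[0]=""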
Whenever the monitor script detects an event, it logs information to a file
that has the same name as your NFS control script with a .log
extension added. Each NFS package has its own log file. For example, if your
control script is called pkg1.cntl, the package log file is called
pkg1.cntl.log. The NFS monitor log file, which resides in the same
directory as the NFS control script, is always called hanfs.sh.log.
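For instance, if the control script is kept in a per-package directory (the
path below is only an assumption for illustration; substitute the directory
where your control script actually resides), you can watch both logs while
testing a package:

    # Illustrative path; use your package's control-script directory
    cd /usr/local/cmcluster/conf/pkg1
    tail -f pkg1.cntl.log       # package control script log
    tail -f hanfs.sh.log        # NFS monitor log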
Remote mount table synchronization
With the NFS toolkit, a remote mount table synchronization binary is
installed as /usr/bin/sync_rmtab. This program synchronizes the
client mount table, /var/lib/nfs/rmtab, in the case of an NFS package
failover. This synchronization ensures that NFS clients can continue to
access NFS seamlessly when the NFS package fails over. The NFS control
script, hanfs.sh, calls the synchronization program whenever the remote
mount table needs to be synchronized.
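The file /var/lib/nfs/rmtab is the server-side record of which clients have
mounted which exported directories. The entry below is an illustrative Linux
nfs-utils example; the client address and export path are made up, and the
exact format can vary between nfs-utils versions:

    # Illustrative contents of /var/lib/nfs/rmtab
    # format: client:exported-directory:usage-count (hex)
    192.0.2.10:/exports/data:0x00000001

After a failover, the adoptive node needs equivalent entries so that, for
example, showmount -a reports the package's clients correctly.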
On the Client Side
The client should NFS-mount a file system using the package name in
the mount command. The package name is associated with the package’s
relocatable IP address. On client systems, be sure to use a hard mount
and set the proper retry values for the mount. Alternatively, set the
proper timeout for the automounter. The timeout should be greater than the
total end-to-end recovery time for the Serviceguard NFS package, which
includes running fsck, mounting the file systems, and exporting the file
systems on the new node. Setting the timeout to a value greater than the
recovery time allows clients to reconnect to the file system after it
returns to the cluster on the new node.
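As a concrete illustration, the client-side commands below assume a package
named nfs-pkg1 exporting /exports/data; the package name, export path, mount
point, and option values are all made up and should be adjusted to your
configuration and recovery time.

    # Hard mount with interrupts enabled; timeo is in tenths of a second
    # and retrans sets the retry count (values shown are illustrative)
    mount -t nfs -o hard,intr,timeo=600,retrans=5 nfs-pkg1:/exports/data /mnt/data

For automounter users, illustrative map entries along the same lines might be:

    # /etc/auto.master: mount point, map file, and a timeout longer than
    # the package's end-to-end recovery time
    /nfs    /etc/auto.nfs   --timeout=600

    # /etc/auto.nfs: key, mount options, location
    data    -hard,intr      nfs-pkg1:/exports/data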