Serviceguard NFS Toolkit for Linux Version A.03.00 Release Notes, March 2009

NOTE: toolkit.sh, hanfs.sh, nfs.mon, nfs.flm, and hanfs.conf must be in the
same directory. For legacy packages, the pkg.cntl script must also be in the same directory.
Do not rename the following files, because their names are hard-coded in the control scripts:
toolkit.sh, hanfs.sh, hanfs.conf, nfs.flm, nfs.mon, tkit_module.sh, tkit_gen.sh,
lockmigration.sh, nfs.1 and its softlink nfs.
Known Problems and Workarounds
The following describes known problems with the NFS Toolkit and workarounds for them.
This information is subject to change without notice. For the most current information, contact
your HP support representative.
More recent information on known problems and workarounds may be available on the
Hewlett-Packard IT Resource Center:
http://itrc.hp.com (Americas and Asia Pacific)
http://europe.itrc.hp.com (Europe)
JAGaf57739: HA NFS and ‘Stale NFS Handle’
What is the problem?
Serviceguard relies on LVM (Logical Volume Manager) to manage its shared storage, which
contains the data and files of applications managed by Serviceguard. As with other applications,
an NFS instance managed by Serviceguard is “active” on only one node at a time, with all of its
resources available to that node only (including the volume groups configured as resources for
the NFS package).
If the package goes down on node 1, its resources are released so that the second node can
“claim” them. The package is then brought up, and the instance is “active” on node 2.
NFS clients continue to connect to the server, unaware that the server has migrated from one
node to another.
The behavior changed in the Linux 2.6 kernel: the kernel uses LVM2 with the device mapper
to virtualize devices instead of using the actual physical device names.
In this implementation, the device node for a logical volume is created dynamically when the
volume group is activated, so a volume group that starts out on node 1 and is failed over to
node 2 can easily end up with a different minor number after failover. Because the volume
groups are created with dynamic minor numbers, NFS clients that connected before the failover
receive a “stale NFS handle” error when the NFS package is migrated to another node.
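For illustration, the device number of a logical volume can be checked after volume group
activation on each node. The volume group and logical volume names below are hypothetical,
and the output is only an example of how the minor number can differ between nodes:

    # On node 1, after the volume group is activated (hypothetical names and output)
    ls -lL /dev/vgnfs/lvol1
    brw-------  1 root root 253, 3 Mar  2 10:15 /dev/vgnfs/lvol1

    # On node 2, after failover, the same logical volume may receive a different minor number
    ls -lL /dev/vgnfs/lvol1
    brw-------  1 root root 253, 7 Mar  2 10:32 /dev/vgnfs/lvol1

The NFS file handle encodes the device number of the exported filesystem, so a change in the
minor number invalidates handles held by clients that mounted before the failover.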
What is the workaround?
There are two ways to address this issue (see the examples below):
• Create the logical volumes with persistent minor numbers.
• Export the filesystem with an assigned filesystem identification (fsid).
NOTE: Refer to the README file for more detailed information.
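The following sketch shows both approaches. All volume group, logical volume, and export
path names are hypothetical, and the exact values depend on your configuration; refer to the
README file for the supported procedure:

    # Workaround 1: give the logical volume a persistent (fixed) major/minor number
    # so that its device number is identical on every node in the cluster.
    lvcreate -L 10G --persistent y --major 253 --minor 100 -n lvol1 vgnfs

    # For an existing logical volume (deactivate it first, then reactivate it afterwards):
    lvchange --persistent y --major 253 --minor 100 /dev/vgnfs/lvol1

    # Workaround 2: export the filesystem with a fixed fsid in /etc/exports so the
    # NFS file handle no longer depends on the underlying device number.
    /export/data  *(rw,sync,fsid=1)

Either approach keeps the NFS file handles seen by clients valid across a package failover.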
JAGag06739: Serviceguard/LX NFS server returns ESTALE when package is brought
down
What is the problem?
When the NFS package is brought down, client processes receive ESTALE from the server when
they should not. There is a delay before the network interface is brought down, during which
client requests can still reach the NFS server layer.