Managing HP Serviceguard A.11.20.20 for Linux, May 2013

If you enter a value greater than 60 seconds (60,000,000
microseconds), cmcheckconf and cmapplyconf will note
the fact, as confirmation that you intend to use a large value.
Minimum supported values:
3 seconds for a cluster with more than one heartbeat
subnet.
14 seconds for a cluster that has only one heartbeat
LAN.
With the lowest supported value of 3 seconds, a failover
time of 4 to 5 seconds can be achieved.
NOTE: The failover estimates provided here apply to the
Serviceguard component of failover; that is, the package is
expected to be up and running on the adoptive node in this
time, but the application that the package runs may take
more time to start.
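The limits described above can be sketched as a small pre-check to run before editing the cluster configuration file. This is an illustrative sketch only, not part of Serviceguard: the function name check_member_timeout, its parameters, and the message wording are assumptions, and it does not reproduce the actual output of cmcheckconf or cmapplyconf. Only the three numeric limits come from the text.

```python
from typing import List

# Limits taken from the text, expressed in microseconds.
MIN_MULTI_SUBNET_US = 3_000_000    # more than one heartbeat subnet
MIN_SINGLE_LAN_US = 14_000_000     # only one heartbeat LAN
LARGE_VALUE_US = 60_000_000        # values above this are noted


def check_member_timeout(timeout_us: int, heartbeat_subnets: int) -> List[str]:
    """Return messages about a proposed MEMBER_TIMEOUT (hypothetical helper).

    An empty list means the value is within the limits stated in the text.
    """
    if heartbeat_subnets < 1:
        raise ValueError("a cluster needs at least one heartbeat subnet")

    messages = []
    # The supported minimum depends on how many heartbeat subnets exist.
    minimum = MIN_MULTI_SUBNET_US if heartbeat_subnets > 1 else MIN_SINGLE_LAN_US
    if timeout_us < minimum:
        messages.append(
            f"ERROR: {timeout_us} us is below the supported minimum of {minimum} us"
        )
    # Large values are allowed, but should be a deliberate choice.
    if timeout_us > LARGE_VALUE_US:
        messages.append(
            f"NOTE: {timeout_us} us exceeds 60 seconds; confirm this is intended"
        )
    return messages
```

For example, check_member_timeout(3_000_000, 1) reports an error because a cluster with a single heartbeat LAN requires at least 14 seconds, while check_member_timeout(3_000_000, 2) returns no messages.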
For most clusters that use a lock LUN, a minimum
MEMBER_TIMEOUT of 14 seconds is appropriate.
For most clusters that use a MEMBER_TIMEOUT value lower
than 14 seconds, a quorum server is more appropriate than
a lock LUN. The cluster will fail if the time it takes to acquire
the disk lock exceeds 0.2 times the MEMBER_TIMEOUT. This
means that if you use a disk-based quorum device (lock
LUN), you must be certain that the nodes in the cluster, the
connection to the disk, and the disk itself can respond
quickly enough to perform 10 disk writes within 0.2 times
the MEMBER_TIMEOUT.
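To make the disk-lock constraint concrete, the following sketch works out the time budget that 0.2 times the MEMBER_TIMEOUT leaves for the 10 disk writes. The function names are hypothetical, and the per-write figure is only an average that assumes the 10 writes share the budget equally; real latency will vary from write to write.

```python
LOCK_FRACTION_DIVISOR = 5   # 0.2 times MEMBER_TIMEOUT is MEMBER_TIMEOUT / 5
LOCK_WRITES = 10            # number of disk writes stated in the text


def lock_budget_us(member_timeout_us: int) -> int:
    """Time available to acquire the disk lock, in microseconds."""
    return member_timeout_us // LOCK_FRACTION_DIVISOR


def average_write_budget_us(member_timeout_us: int) -> int:
    """Average time available per write (assumes equal share per write)."""
    return lock_budget_us(member_timeout_us) // LOCK_WRITES


for timeout_us in (14_000_000, 3_000_000):
    print(
        f"MEMBER_TIMEOUT {timeout_us} us: "
        f"lock budget {lock_budget_us(timeout_us)} us, "
        f"about {average_write_budget_us(timeout_us)} us per write"
    )
```

With a MEMBER_TIMEOUT of 14 seconds the lock must be acquired within 2.8 seconds, or roughly 280 milliseconds per write on average. At the 3-second minimum the budget shrinks to 0.6 seconds, roughly 60 milliseconds per write, which illustrates why a quorum server is usually the better choice for low MEMBER_TIMEOUT values.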
Keep the following guidelines in mind when deciding how
to set the value.
Guidelines: You need to decide whether it's more important
for your installation to have fewer (but slower) cluster
re-formations, or faster (but possibly more frequent)
re-formations:
To ensure the fastest cluster re-formations, use the
minimum value applicable to your cluster. But keep in
mind that this setting will lead to a cluster re-formation,
and to the node being removed from the cluster and
rebooted, if a system hang or network load spike
prevents the node from sending a heartbeat signal
within the MEMBER_TIMEOUT value. More than one
node could be affected if, for example, a network event
such as a broadcast storm causes kernel interrupts to
be turned off on some or all nodes while the packets
are being processed, preventing those nodes from
sending and processing heartbeat messages.
See “Cluster Re-formations Caused by
MEMBER_TIMEOUT Being Set too Low” (page 258) for
troubleshooting information.
For fewer re-formations, use a setting in the range of
10 to 25 seconds (10,000,000 to 25,000,000