Redundancy of Cluster Components
To provide a high level of availability, a typical cluster uses redundant system components, for example, two or more SPUs and two or more independent disks. Redundancy eliminates single points of failure. In general, the more redundancy, the greater your access to applications, data, and supportive services in the event of a failure. In addition to hardware redundancy, you need software support to enable and control the transfer of your applications to another SPU or network after a failure. Serviceguard provides this support as follows:
• In the case of LAN failure, the Linux bonding facility provides a standby LAN, or Serviceguard moves packages to another node (a sample bonding configuration follows this list).

• In the case of SPU failure, your application is transferred from a failed SPU to a functioning SPU automatically and in a minimal amount of time.

• For software failures, an application can be restarted on the same node or on another node with minimum disruption.
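As an illustration of the first case, channel bonding in active-backup mode keeps a standby NIC ready to take over if the primary LAN interface fails. The following is only a minimal sketch, assuming a Red Hat-style network configuration of the period; the interface names (bond0, eth0, eth1) and the IP address are placeholders rather than values taken from this manual.

   # /etc/modprobe.conf -- load the bonding driver in active-backup mode,
   # checking link status every 100 ms
   alias bond0 bonding
   options bond0 mode=active-backup miimon=100

   # /etc/sysconfig/network-scripts/ifcfg-bond0 -- the bonded (master) interface
   DEVICE=bond0
   IPADDR=192.168.1.10
   NETMASK=255.255.255.0
   ONBOOT=yes
   BOOTPROTO=none

   # /etc/sysconfig/network-scripts/ifcfg-eth0 -- a slave interface
   # (ifcfg-eth1 is configured the same way)
   DEVICE=eth0
   MASTER=bond0
   SLAVE=yes
   ONBOOT=yes
   BOOTPROTO=none

With such a bond in place, a single NIC or cable failure is absorbed by the standby slave without any package movement; Serviceguard moves packages to another node only when the bonded subnet as a whole becomes unreachable on that node.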
Serviceguard also gives you the advantage of easily transferring control of your application to another SPU so that you can bring the original SPU down for system administration, maintenance, or version upgrades.
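For example, a package can be moved by hand before planned maintenance on its current node. The package name pkg1 and node name node2 below are placeholders; the commands shown (cmhaltpkg, cmrunpkg, cmmodpkg) are the standard Serviceguard command-line tools, and this is a sketch of one typical sequence rather than a complete procedure.

   # Halt the package on the node where it is currently running
   cmhaltpkg pkg1

   # Start the package on an alternate node
   cmrunpkg -n node2 pkg1

   # Re-enable package switching so automatic failover is possible again
   cmmodpkg -e pkg1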
The current maximum number of nodes supported in a Serviceguard for Linux cluster is 16, depending on the configuration. SCSI disk arrays can be connected to a maximum of four nodes at a time on a shared bus; a FibreChannel connection lets you employ up to 16 nodes. HP-supported disk arrays can be simultaneously connected to multiple nodes.
The guidelines for package failover depend on the type of disk technology in the cluster. For example, a package that accesses data on a SCSI disk array can fail over to a maximum of four nodes. A package that accesses data from a disk in a cluster using FibreChannel disk technology can be configured for failover among up to 16 nodes.
A package that does not access data from a disk on a shared bus can be configured to fail over to as many nodes as you have configured in the cluster (currently a maximum of 16), regardless of disk technology. For instance, if a package only runs local executables, it can be configured to fail over to all nodes in the cluster that have local copies of those executables, regardless of the type of disk connectivity.
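Which nodes a package can fail over to is governed by the node list in its configuration file. The fragment below is a hedged sketch using legacy-style (ASCII) package configuration parameters; the package and node names are illustrative, and many required parameters are omitted.

   PACKAGE_NAME      pkg1
   PACKAGE_TYPE      FAILOVER

   # A package whose data sits on a four-node shared SCSI bus can list
   # only the nodes attached to that bus:
   NODE_NAME         node1
   NODE_NAME         node2
   NODE_NAME         node3
   NODE_NAME         node4

   # A package that uses no shared storage could instead specify
   #     NODE_NAME *
   # which allows it to run on any node configured in the cluster.

After editing, the file would typically be verified with cmcheckconf -P and distributed with cmapplyconf -P.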