Cluster Configuration System
The Cluster Configuration System (CCS) manages the cluster configuration and provides
configuration information to the other cluster components in a Red Hat Cluster. The CCS daemon
runs on each cluster node and ensures that the cluster configuration file on every node is up to
date. When cluster.conf is modified by the operator, the local CCS daemon broadcasts the new
file and the CCS daemons on the other cluster nodes replace their copies with the new one.
Likewise, when a CCS daemon starts up, it broadcasts its cluster.conf to determine whether it
needs to be replaced by a newer version in use on other nodes. The cluster configuration file
(/etc/cluster/cluster.conf) is an XML file that describes the cluster characteristics and is stored
locally on all nodes in the cluster.
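For example, after editing the file on one node (and incrementing the config_version attribute in
the <cluster> tag), the new version can be pushed to all members with the ccs_tool command
shipped with the RHEL5 cluster suite; this is a minimal sketch using the standard file location:

    # Propagate the updated configuration to all cluster nodes:
    ccs_tool update /etc/cluster/cluster.conf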
SAN failure
Loss of the storage system makes data unavailable and causes most services to fail over to an
alternate node. Therefore, all storage systems must have redundant controllers and power
supplies. Multiple paths to shared storage are required so that the loss of one storage path does
not force a failover between nodes, but instead causes a failover to the redundant path on the
same node. If a node has only a single path to shared storage, then any failure in that path may
cause all packages relying on that shared storage to fail over to another node in the cluster.
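As a quick check (assuming the device-mapper-multipath package that ships with RHEL5 is used
to provide path redundancy), the state of the paths to each shared LUN can be inspected on
every node:

    # List each multipath device and the state of its paths;
    # every shared LUN should show two active paths:
    multipath -ll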
Since Red Hat Cluster does not fence a node when it loses access to storage, another mechanism
must be used to force a package failover when Serviceguard is running. In a recommended
Serviceguard configuration there are dual paths to storage, so two failures are necessary to lose
access to storage. For customers who want their systems to survive that dual failure, the disk
monitor can be used in all packages in a dual cluster. This ensures that when a node has failed
FC link(s), the packages on that node are moved to adoptive nodes. However, if Serviceguard
attempts to move a package back to such a failed node before the failure is fixed, the package
will fail again and move on to another adoptive node. For more information, refer to the manual
Using High Availability Monitors available at http://docs.hp.com.
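As an illustrative sketch of that setup (the service name, device path, and install location below
are hypothetical, and it is assumed that the Serviceguard for Linux disk monitor daemon
cmresserviced described in the manual above is used), the disk monitor runs as a service inside
each package:

    # Excerpt from a legacy package control script (names hypothetical).
    # cmresserviced monitors the given device and exits when access to
    # the disk is lost, causing Serviceguard to fail the package over:
    SERVICE_NAME[0]="pkg1_disk_monitor"
    SERVICE_CMD[0]="/usr/local/cmcluster/bin/cmresserviced /dev/mapper/mpath0"
    SERVICE_RESTART[0]=""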
After the FC links are restored, the failed node must be rebooted in order to restore the GFS
mount points.
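A minimal recovery sequence, assuming the FC links have already been verified as healthy,
might look like this on the failed node:

    # Reboot the node so it rejoins the cluster and remounts GFS:
    reboot
    # After the node is back up, confirm cluster membership and
    # that the GFS file systems are mounted again:
    cman_tool status
    mount -t gfs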
Cluster management in a Dual Cluster
Automatic cluster startup at node startup
The Red Hat cluster can be configured to start at boot time by enabling the “cman”, “clvmd”,
and “gfs” services through the chkconfig command. With these services enabled, a node will
attempt to join the existing Red Hat cluster at boot time and mount the GFS file systems.
Similarly, in a Serviceguard cluster, setting AUTOSTART_CMCLD to 1 causes the “cmcluster”
service to start on a node at boot time. By starting the “cmcluster” service, a node automatically
joins the existing Serviceguard cluster at boot time.
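The commands below, run once on each node, enable this behavior; the location of the
AUTOSTART_CMCLD setting is assumed to be the default Serviceguard for Linux configuration
file and may differ on your installation:

    # Start the Red Hat cluster services automatically at boot:
    chkconfig cman on
    chkconfig clvmd on
    chkconfig gfs on

    # Enable Serviceguard autostart by editing the cmcluster
    # configuration file so that it contains AUTOSTART_CMCLD=1:
    grep AUTOSTART_CMCLD /usr/local/cmcluster/conf/cmcluster.config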
At Serviceguard package startup time, the package control script verifies whether the GFS file
systems are mounted and, if not, mounts them before starting the application. For a GFS file
system to be mounted, the Red Hat cluster services must be up on that node. Hence it is
recommended to have the Red Hat cluster services cman, clvmd, and gfs started at node startup
time. This ensures that the GFS file systems are available before any Serviceguard packages that
depend on them are started.
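As a sketch of that check (the device, mount point, and use of the customer-defined run
commands hook are illustrative, not taken from a shipped control script):

    # In the package control script, mount the GFS file system
    # if it is not already mounted (placeholders shown):
    function customer_defined_run_cmds
    {
        if ! grep -q " /mnt/gfs1 " /proc/mounts ; then
            mount -t gfs /dev/vg_shared/lv_gfs1 /mnt/gfs1
        fi
    }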