VERITAS Volume Manager 3.5 Administrator's Guide (September 2002)
Overview of Cluster Volume Management
In recent years, tightly coupled cluster systems have become increasingly popular in the
realm of enterprise-scale mission-critical data processing. The primary advantage of
clusters is protection against hardware failure. Should the primary node fail or otherwise
become unavailable, applications can continue to run by transferring their execution to
standby nodes in the cluster. This ability to provide continuous availability of service by
switching to redundant hardware is commonly termed failover.
Another major advantage of clustered systems is their ability to reduce contention for
system resources caused by activities such as backup, decision support and report
generation. Businesses can derive enhanced value from their investment in cluster
systems by performing such operations on lightly loaded nodes in the cluster rather than
on the heavily loaded nodes that answer requests for service. This ability to perform some
operations on the lightly loaded nodes is commonly termed load balancing.
The cluster functionality of VxVM works together with the cluster monitor daemon that is
provided by the host operating system. The cluster monitor informs VxVM of changes in
cluster membership. Each node starts up independently and has its own cluster monitor
plus its own copies of the operating system and VxVM with support for cluster
functionality. When a node joins a cluster, it gains access to shared disks. When a node
leaves a cluster, it no longer has access to shared disks. A node joins a cluster when the
cluster monitor is started on that node.
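The relationship between cluster membership and shared-disk access can be pictured with a small model. The following Python sketch is illustrative only and is not part of VxVM or of any cluster monitor. The class name Cluster, its methods, and the node and disk names are all hypothetical. It assumes only what is stated above: a node joins when its cluster monitor starts, and a node can access shared disks only while it is a member.

```python
class Cluster:
    """Toy model: shared-disk access follows cluster membership."""

    def __init__(self, shared_disks):
        self.shared_disks = frozenset(shared_disks)
        self.members = set()  # names of nodes currently in the cluster

    def start_monitor(self, node):
        """Starting the cluster monitor on a node makes it join the cluster."""
        self.members.add(node)

    def stop_monitor(self, node):
        """A node that leaves the cluster is dropped from the membership."""
        self.members.discard(node)

    def visible_disks(self, node):
        """Return the shared disks the node can access right now."""
        return self.shared_disks if node in self.members else frozenset()


# Hypothetical node and disk names, used only for illustration.
cluster = Cluster(["disk01", "disk02"])
before = cluster.visible_disks("node0")  # not yet joined: no shared disks
cluster.start_monitor("node0")
during = cluster.visible_disks("node0")  # joined: all shared disks
cluster.stop_monitor("node0")
after = cluster.visible_disks("node0")   # left: no shared disks again
```

In this model, access is derived from membership rather than stored per node, which mirrors the description above: a change in membership is the only event that changes what a node can see.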
“Example of a 4-Node Cluster” on page 247 illustrates a simple cluster arrangement
consisting of four nodes with similar or identical hardware characteristics (CPUs, RAM
and host adapters), and configured with identical software (including the operating
system). The nodes are fully connected by a private network and they are also separately
connected to shared external storage (either disk arrays or JBODs: just a bunch of disks) via
SCSI or Fibre Channel. Each node has two independent paths to these disks, which are
configured in one or more cluster-shareable disk groups.
The private network allows the nodes to share information about system resources and
about each other’s state. Using the private network, any node can recognize which other
nodes are currently active, which are joining or leaving the cluster, and which have failed.
The private network requires at least two communication channels to provide
redundancy against one of the channels failing. If only one channel were used, its failure
would be indistinguishable from node failure, a condition known as network partitioning.
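The reasoning behind the two-channel requirement can be sketched as a simple decision rule. The following Python fragment is a minimal illustration under stated assumptions, not the algorithm used by any cluster monitor. The function classify_peer, the channel names, and the returned strings are hypothetical. It assumes that each node can tell, for each channel, whether a heartbeat from a peer was seen recently.

```python
def classify_peer(heartbeats):
    """Classify a peer from per-channel heartbeat observations.

    heartbeats maps a channel name to True if a heartbeat from the peer
    was recently seen on that channel, and to False otherwise.
    """
    if not heartbeats:
        raise ValueError("at least one channel is required")

    alive = [channel for channel, seen in heartbeats.items() if seen]

    if len(alive) == len(heartbeats):
        return "peer up"
    if alive:
        # The peer still answers on another channel, so the silent
        # channel has failed rather than the node.
        return "peer up, channel failed"
    if len(heartbeats) == 1:
        # With a single channel, silence cannot be attributed to either
        # the channel or the node.
        return "ambiguous: node or channel failure"
    # All redundant channels are silent at once. This is treated here as
    # a node failure, which is a presumption and not a certainty.
    return "peer presumed failed"
```

For example, classify_peer({"net0": False, "net1": True}) reports a channel failure, whereas classify_peer({"net0": False}) can report only an ambiguous result. The second channel is what allows the two cases to be told apart.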