Understanding and Designing Serviceguard Disaster Recovery Architectures

Figure 1 High Availability Architecture
[Figure: Node 1 runs pkg A and node 2 runs pkg B, each with its own mirrors and client connections. When node 1 fails, pkg A fails over to node 2.]
This architecture, which is typically implemented on one site in a single data center, is sometimes
called a local cluster. For some installations, the level of protection provided by a local cluster is
insufficient. Consider an order processing center where power outages are common during harsh
weather. Or consider the systems running the stock market, where multiple system failures have
a significant financial impact. For these types of installations, and many more like them, it is
important to protect not only against single points of failure, but also against multiple points of
failure (MPOF) and against single massive failures that cause many components to fail, such as the
failure of a data center, of an entire site, or of a small area. A data center, in the context of disaster
recovery, is a physically proximate collection of nodes and disks, usually all in one room.
Creating clusters that are resistant to multiple points of failure or single massive failures requires a
unique type of cluster architecture called a disaster recovery architecture. This architecture provides
you with the ability to fail over automatically to another part of the cluster or manually to a different
cluster after disasters. Specifically, the disaster recovery cluster provides appropriate failover if a
disaster causes an entire data center to fail, as shown in Figure 2.
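
The idea of failing over when an entire data center is lost can be illustrated with a small toy model. The sketch below is not Serviceguard code and does not reflect how Serviceguard selects a failover node; the names Node, fail_data_center, and where_is, the data center labels dc_a and dc_b, and the "fewest packages" placement policy are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """A cluster node located in a named data center (hypothetical model)."""
    name: str
    data_center: str
    up: bool = True
    packages: list = field(default_factory=list)


def fail_data_center(nodes, data_center):
    """Mark every node in data_center as down and move its packages to
    surviving nodes. Assumed toy policy: each package goes to the surviving
    node that currently runs the fewest packages."""
    failed = [n for n in nodes if n.data_center == data_center]
    for n in failed:
        n.up = False
    survivors = [n for n in nodes if n.up]
    if not survivors:
        raise RuntimeError("no surviving nodes: the whole cluster is down")
    for n in failed:
        while n.packages:
            pkg = n.packages.pop(0)
            target = min(survivors, key=lambda s: len(s.packages))
            target.packages.append(pkg)
    return nodes


def where_is(nodes, pkg):
    """Return the name of the running node that hosts pkg, or None."""
    for n in nodes:
        if n.up and pkg in n.packages:
            return n.name
    return None


if __name__ == "__main__":
    cluster = [
        Node("node1", "dc_a", packages=["pkg A"]),
        Node("node2", "dc_a", packages=["pkg B"]),
        Node("node3", "dc_b"),
        Node("node4", "dc_b"),
    ]
    fail_data_center(cluster, "dc_a")
    for pkg in ("pkg A", "pkg B"):
        print(pkg, "now runs on", where_is(cluster, pkg))
```

In this model a local cluster corresponds to every node sharing one data center label, in which case losing that data center leaves no survivors and the function raises an error. Spreading nodes across two data centers is what allows both packages to keep running.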