Understanding and Designing Serviceguard Disaster Recovery Architectures

Figure 1 High Availability Architecture

[Figure: Node 1 and Node 2 serve client connections. Node 1 runs pkg A and Node 2 runs pkg B, each with its own mirrored disks (pkg A mirrors, pkg B mirrors). When Node 1 fails, pkg A fails over to Node 2.]

This architecture, which is typically implemented on one site in a single data center, is sometimes
called a local cluster. For some installations, the level of protection provided by a local cluster is
insufficient. Consider an order processing center where power outages are common during harsh
weather. Or consider the systems running the stock market, where multiple system failures have
a significant financial impact. For these types of installations, and many more like them, it is
important to protect not only against single points of failure, but also against multiple points of
failure (MPOF) and against single massive failures that cause many components to fail, such as the
failure of a data center, of an entire site, or of a small area. A data center, in the context of
disaster recovery, is a physically proximate collection of nodes and disks, usually all in one room.

Creating clusters that are resistant to multiple points of failure or single massive failures requires
a unique type of cluster architecture called a disaster recovery architecture. This architecture
provides you with the ability to fail over automatically to another part of the cluster or manually to
a different cluster after a disaster. Specifically, the disaster recovery cluster provides appropriate
failover if a disaster causes an entire data center to fail, as shown in Figure 2.
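
The difference between a local cluster and a disaster recovery architecture can be illustrated with a small toy model. The sketch below is written in Python purely for illustration. It is not Serviceguard code and does not use any Serviceguard command or API. The names Node, fail_over, fail_data_center, dc1, dc2, and node3 are hypothetical, and the placement rule (move every package to the first surviving node) is an assumption made to keep the example short.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical cluster node; not a Serviceguard object."""
    name: str
    data_center: str
    up: bool = True
    packages: list = field(default_factory=list)


def fail_over(nodes):
    """Move packages off failed nodes onto the first surviving node.

    Returns the packages that could not be placed because no node survived.
    """
    survivors = [n for n in nodes if n.up]
    stranded = []
    for node in nodes:
        if node.up:
            continue
        while node.packages:
            pkg = node.packages.pop(0)
            if survivors:
                survivors[0].packages.append(pkg)
            else:
                stranded.append(pkg)
    return stranded


def fail_data_center(nodes, data_center):
    """Mark every node in the given data center as failed."""
    for node in nodes:
        if node.data_center == data_center:
            node.up = False


def single_node_failure():
    """Local cluster: node1 fails, so pkg A moves to node2."""
    nodes = [Node("node1", "dc1", packages=["pkg A"]),
             Node("node2", "dc1", packages=["pkg B"])]
    nodes[0].up = False
    stranded = fail_over(nodes)
    return nodes[1].packages, stranded


def data_center_failure(second_site):
    """The whole of dc1 fails; optionally a spare node exists in dc2."""
    nodes = [Node("node1", "dc1", packages=["pkg A"]),
             Node("node2", "dc1", packages=["pkg B"])]
    if second_site:
        nodes.append(Node("node3", "dc2"))
    fail_data_center(nodes, "dc1")
    return fail_over(nodes)


if __name__ == "__main__":
    print(single_node_failure())       # node2's packages, stranded packages
    print(data_center_failure(False))  # stranded packages, local cluster only
    print(data_center_failure(True))   # stranded packages, with a second site
```

In this model, a single node failure leaves no package stranded, while the loss of the only data center strands every package. Adding a node in a second data center is what allows the packages to be placed again, which is the property a disaster recovery architecture is meant to provide.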
