Understanding and Designing Serviceguard Disaster Recovery Architectures

NOTE: “Metrocluster and Continentalclusters” (page 20) provides an overview of HP’s

implementation of Metropolitan Cluster and Continental Cluster, while Chapter 3 provides an

overview of Extended Distance Clusters.

Disaster Recovery Architecture Guidelines

The disaster recovery architectures represent a shift away from the massive central data centers

and towards more distributed data processing facilities. Although each architecture is tailored to

specific availability needs, a few basic guidelines apply to designing a disaster recovery architecture

that protects against the loss of an entire data center:

• Protect nodes through geographic dispersion

• Protect data through replication

• Use alternative power sources

• Create highly available networks

These guidelines supplement the standard high-availability practice of using redundant components

and paths, such as multipathing, network cards, power supplies, and disks.

Protecting Nodes through Geographic Dispersion

Redundant nodes in a disaster recovery architecture must be geographically dispersed. If they are

in the same data center, it is not a disaster recovery architecture. Figure 2 (page 8) shows a

cluster architecture with nodes in two data centers: A and B. If all nodes in data center A fail,

applications can fail over to the nodes in data center B and continue to provide service for clients.

The type of disaster you are protecting against and the available technology determine the location

of the nodes. The nodes can be as close as another room in the same building, or as far away as

another city. The minimum recommended dispersion is a single building with redundant nodes in

different data centers using different power sources. Specific architectures based on geographic

dispersion are discussed in “Understanding types of disaster recovery clusters” (page 8).
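
The dispersion rule described above can be expressed as a simple check. The following Python sketch is

illustrative only: the Node record, its fields, and the functions meets_minimum_dispersion and

surviving_nodes are hypothetical names and are not part of Serviceguard. It assumes that each node is

tagged with a data center and a power source, and it interprets "different power sources" to mean that

no power source feeds nodes in more than one data center.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """Hypothetical description of a cluster node (not a Serviceguard structure)."""
    name: str
    data_center: str
    power_source: str


def meets_minimum_dispersion(nodes):
    """Return True if nodes span at least two data centers that share no power source."""
    centers_by_power = {}
    for node in nodes:
        centers_by_power.setdefault(node.power_source, set()).add(node.data_center)
    data_centers = {node.data_center for node in nodes}
    shared_power = any(len(centers) > 1 for centers in centers_by_power.values())
    return len(data_centers) >= 2 and not shared_power


def surviving_nodes(nodes, failed_data_center):
    """Return the nodes still available after an entire data center is lost."""
    return [node for node in nodes if node.data_center != failed_data_center]


# Example layout: two nodes in each of data centers A and B, on separate circuits.
cluster = [
    Node("node1", "A", "circuit-1"),
    Node("node2", "A", "circuit-1"),
    Node("node3", "B", "circuit-2"),
    Node("node4", "B", "circuit-2"),
]
```

In this example, meets_minimum_dispersion(cluster) holds, and surviving_nodes(cluster, "A") leaves the

two nodes in data center B available to take over the applications.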

Protecting Data through Replication

During a disaster you might lose access to data or the data itself. You can protect against this loss

through data replication, that is, by creating extra copies of the data. When you replicate data, you must

do the following:

• Ensure data consistency by replicating data in a logical order so that it is immediately usable

or recoverable. Inconsistent data is unusable and is not recoverable for processing. Consistent

data might not be current.

• Ensure data currency by replicating data quickly so that a replica of the data can be recovered

to include all committed disk writes that were applied to the local disks.

• Ensure data recoverability so that there is some action that can be taken to make the data

consistent, such as applying logs or rolling a database.

• Minimize data loss by configuring data replication to address consistency, currency, and

recoverability.
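
The difference between consistency and currency can be illustrated with a toy model. The Python sketch

below is an assumption-laden illustration and does not describe any particular replication product: it

treats the committed writes on the primary as an ordered log and a replica as the list of writes it has

applied. All function names are hypothetical. In this model, a replica that holds an in-order prefix of

the log is consistent, and it is current only if that prefix is the whole log.

```python
def is_consistent(primary_log, replica_log):
    """A replica is consistent if it holds an in-order prefix of the primary log."""
    return replica_log == primary_log[:len(replica_log)]


def is_current(primary_log, replica_log):
    """A replica is current if it holds every committed write, in order."""
    return replica_log == primary_log


def missing_writes(primary_log, replica_log):
    """Return the writes a consistent replica still needs in order to become current."""
    if not is_consistent(primary_log, replica_log):
        raise ValueError("replica is inconsistent; replaying the tail would not repair it")
    return primary_log[len(replica_log):]


# Example: four committed writes on the primary.
primary = ["w1", "w2", "w3", "w4"]
lagging = ["w1", "w2"]        # consistent, but not current
out_of_order = ["w1", "w3"]   # skipped w2, so not consistent
```

Here the lagging replica is usable as it stands and loses only the writes returned by

missing_writes(primary, lagging), while the out-of-order replica cannot be used until some recovery

action restores a consistent state.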

The advantages of the different data replication methods vary with regard to data consistency

and currency. The data replication method that you select depends on the type

of disaster recovery architecture you require.

Offline Data Replication

Offline data replication is the most common method used today. It involves two or more data

centers that store their data on tape and either send it to each other (via an express service, if need
