HP Serviceguard Extended Distance Cluster for Linux A.01.00 Deployment Guide, Second Edition, May 2008

Disaster Tolerance and Recovery in a Serviceguard Cluster
Disaster Tolerant Architecture Guidelines
Chapter 1
Protecting Data through Replication
The most significant losses during a disaster are the loss of access to
data and the loss of the data itself. You protect against these losses through
data replication, that is, by creating extra copies of the data. Data
replication should:
• Ensure data consistency by replicating data in a logical order so
  that it is immediately usable or recoverable. Inconsistent data is
  unusable and is not recoverable for processing. Consistent data may
  or may not be current.
• Ensure data currency by replicating data quickly so that a replica
  of the data can be recovered to include all committed disk writes that
  were applied to the local disks.
• Ensure data recoverability so that there is some action that can be
  taken to make the data consistent, such as applying logs or rolling a
  database.
• Minimize data loss by configuring data replication to address
  consistency, currency, and recoverability.
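The difference between consistency and currency can be illustrated with a minimal sketch. It is not how Serviceguard or any particular replication product works. It assumes a hypothetical scheme in which each write carries a commit-order number (`seq`), and the replica applies writes strictly in that order, stopping at the first gap. The names `Write`, `apply_in_order`, and `is_current` are invented for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Write:
    seq: int    # hypothetical commit-order number assigned at the primary site
    key: str
    value: str


def apply_in_order(replica: Dict[str, str], last_seq: int,
                   received: List[Write]) -> int:
    """Apply received writes strictly in commit order, stopping at the first gap.

    Returns the sequence number of the last write applied.
    """
    pending = {w.seq: w for w in received}
    while last_seq + 1 in pending:
        w = pending[last_seq + 1]
        replica[w.key] = w.value
        last_seq += 1
    return last_seq


def is_current(last_applied_seq: int, primary_last_seq: int) -> bool:
    """A replica is current only if it holds every write committed at the primary."""
    return last_applied_seq == primary_last_seq


# The primary committed four writes; write 3 has not yet reached the replica.
received = [
    Write(1, "order", "created"),
    Write(2, "stock", "reserved"),
    Write(4, "invoice", "sent"),
]
replica: Dict[str, str] = {}
last = apply_in_order(replica, 0, received)
print(last, replica, is_current(last, primary_last_seq=4))
```

In this example the replica stops after write 2. Its contents match the state of the primary as of that write, so it is consistent, but it is not current because writes 3 and 4 are missing. Applying write 4 ahead of write 3 would make the replica look more up to date while leaving it in a state the primary never held.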
Different data replication methods offer different advantages with
regard to data consistency and currency. Which data replication
methods you use will depend on the type of disaster tolerant
architecture you require.
Off-line Data Replication
Off-line data replication is the method most commonly used today. It
involves two or more data centers that store their data on tape and either
send it to each other (through an express service, if need dictates) or
store it off-line in a vault. If a disaster occurs at one site, the off-line copy
of data is used to synchronize data, and a remote site then functions in place of
the failed site.
Because data is replicated using physical off-line backup, data
consistency is fairly high, barring human error or an untested corrupt
backup. However, data currency is compromised by the time delay in
sending the tape backup to a remote site.
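The effect of the shipping delay on data currency can be estimated with simple arithmetic. The sketch below is a rough model, not a formula from this guide. It assumes that backups are taken at a fixed interval starting at time zero and that every tape takes the same time to reach the remote site. The function names and the sample figures are illustrative assumptions.

```python
from datetime import timedelta


def worst_case_staleness(backup_interval: timedelta,
                         transit_delay: timedelta) -> timedelta:
    """Upper bound on the age of the newest backup held at the remote site.

    The bound is approached just before the next tape arrives: the newest
    tape on hand was written one full interval plus one transit time ago.
    """
    return backup_interval + transit_delay


def staleness_at(disaster: timedelta, backup_interval: timedelta,
                 transit_delay: timedelta) -> timedelta:
    """Age of the newest backup at the remote site when the disaster occurs.

    `disaster` is the time elapsed since the first backup was taken.
    """
    if disaster < transit_delay:
        raise ValueError("no backup has reached the remote site yet")
    # Index of the latest backup that had already arrived at the remote site.
    arrived = (disaster - transit_delay) // backup_interval
    return disaster - arrived * backup_interval


# Assumed figures: daily backups, 12 hours in transit,
# disaster 2 days and 6 hours after the first backup.
interval = timedelta(days=1)
transit = timedelta(hours=12)
print(worst_case_staleness(interval, transit))
print(staleness_at(timedelta(days=2, hours=6), interval, transit))
```

With these assumed figures, the remote site can be up to 36 hours behind, and it is 30 hours behind at the chosen disaster time. Shortening either the backup interval or the transit time narrows that window.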
Off-line data replication is adequate for many applications for which recovery
time is not critical to the business. Although data might be
replicated weekly or even daily, recovery could take from a day to a week