Designing Disaster Recovery Clusters using Metroclusters and Continentalclusters, Reprinted October 2011 (5900-1881)

At time T0, all the SRDF links go down. The application continues to run on the R1 side. At time
T1, the SRDF links are restored, and at T2 a manual resynchronization is started to resync new
data from the R1 to the R2 side. At time T3, while resynchronization is in progress, the R1 site
fails, and the application starts up on the R2 side. Because the resynchronization had not completed
when the R1 side failed, the data on the R2 side is corrupt.
Using the BCV in Resynchronization
In the case described above, use the business continuity volumes (BCVs), which protect against a
rolling disaster. First split off a consistent copy of the data at the recovery site, and then perform
the resynchronization. After the resynchronization is complete, re-establish the BCV mirroring. To
protect data consistency on R2 in a rolling disaster, use the following procedure:
1. Before starting the resynchronization from the R1 to the R2 side, disable the package
switch capability to prevent the package from automatically failing over to R2 if a new disaster
occurs while the resync is still in progress. Disable package switching on the R2 nodes:
# cmmodpkg -d pkgname -n node_name
2. From a node on the R2 side, split the BCV in the secondary Symmetrix from the mirror group
to save a good copy of the data:
# symmir -g dgname split
Alternatively, from a node on the R1 side:
# symmir -g dgname split -rdf
3. Begin resynchronizing the data from the R1 to the R2 devices:
# symrdf -g dgname est
4. After the resynchronization is complete, enable package switching on the node on the R2
side:
# cmmodpkg -e pkgname -n node_name
5. Re-establish the BCV as a mirror of the R2 devices on the R2 side:
# symmir -g dgname -full est
Alternatively, from a node on the R1 side:
# symmir -g dgname -full est -rdf
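The five steps above can be collected into one script so that they always run in the same order. The following is a minimal sketch, not a supported tool. The helper `run`, the wrapper `resync_with_bcv`, the `DRY_RUN` variable, and the 30-second polling interval are hypothetical names and values chosen for illustration. The sketch assumes the commands are issued from a node on the R2 side, and that `symrdf verify -synchronized` is an acceptable way to wait for an SRDF/Synchronous device group; confirm both against the Solutions Enabler documentation for your release. With `DRY_RUN=1` (the default here) the commands are only printed.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set.
DRY_RUN=${DRY_RUN:-1}

# run: hypothetical helper that prints a command, or executes it when DRY_RUN=0.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# resync_with_bcv PKG NODE DG: hypothetical wrapper for steps 1 to 5.
resync_with_bcv() {
    pkg=$1 node=$2 dg=$3
    run cmmodpkg -d "$pkg" -n "$node" || return 1   # 1. disable package switching
    run symmir -g "$dg" split || return 1            # 2. save a good copy on the BCV
    run symrdf -g "$dg" est || return 1              # 3. start the R1 to R2 resync
    # Poll until the pair reports Synchronized (assumed check for SRDF/Synchronous).
    until run symrdf -g "$dg" verify -synchronized; do
        sleep 30
    done
    run cmmodpkg -e "$pkg" -n "$node" || return 1   # 4. re-enable package switching
    run symmir -g "$dg" -full est                    # 5. re-establish the BCV mirror
}
```

Calling `resync_with_bcv pkgname node_name dgname` in dry-run mode lists the commands in the order they would run. The SYMCLI commands may prompt for confirmation when executed for real, so an operator should still supervise the run.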
In a Metrocluster with EMC SRDF environment, follow the resynchronization process described
above. It prevents the package from automatically failing over and starting on the R2 side if
a disaster takes place while the resync is in progress. This ensures that the package does not
automatically start and operate on inconsistent data in the event of a rolling disaster.
As demonstrated above, the resync is a manual process that an operator initiates after the
links are fixed. When the resync is complete, the pair state of the devices should be Synchronized
for SRDF/Synchronous or Consistent for SRDF/Asynchronous. Check the state and ensure that
the resync is complete before enabling the package switch.
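The state check can be scripted so that package switching is enabled only after the expected pair state is reached. This is a sketch under assumptions: the function names `verify_flag` and `enable_if_resynced` and the mode keywords `sync` and `async` are hypothetical, and it assumes the `symrdf verify` action with the `-synchronized` and `-consistent` options is available in your Solutions Enabler release.

```shell
#!/bin/sh
# verify_flag MODE: map an SRDF mode keyword (hypothetical: sync|async)
# to the symrdf verify option for the expected pair state.
verify_flag() {
    case "$1" in
        sync)  echo "-synchronized" ;;   # SRDF/Synchronous: Synchronized
        async) echo "-consistent" ;;     # SRDF/Asynchronous: Consistent
        *)     echo "unknown mode: $1" >&2; return 1 ;;
    esac
}

# enable_if_resynced PKG NODE DG MODE: enable package switching only
# when the device group has reached the expected pair state.
enable_if_resynced() {
    flag=$(verify_flag "$4") || return 1
    if symrdf -g "$3" verify "$flag"; then
        cmmodpkg -e "$1" -n "$2"
    else
        echo "resync of $3 not complete; package switching left disabled" >&2
        return 1
    fi
}
```

For example, `enable_if_resynced pkgname node_name dgname sync` runs `cmmodpkg -e` only if the verify command succeeds; otherwise it leaves package switching disabled and returns a non-zero status.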
If Metrocluster with EMC SRDF is used in Continentalclusters, it is not necessary to disable the
package switch on the nodes at the recovery site, because each site has its own cluster. However,
while the resync is in progress, make sure that the recovery site does not start the recovery
operation if a disaster occurs at the primary site. Use the following procedure to protect data
consistency on R2 in a Continentalclusters environment:
1. From a node on the R2 side, split the BCV in the secondary Symmetrix from the mirror group
to save a good copy of the data:
# symmir -g dgname split
Alternatively, from a node on the R1 side:
286 Building Disaster Recovery Serviceguard Solutions Using Metrocluster with EMC SRDF