Building Disaster Recovery Serviceguard Solutions Using Metrocluster with Continuous Access XP P9000 for Linux B.01.00.00

the volumes, with the PVOL becoming the SVOL and SVOL becoming the PVOL. With the
personalities swapped, any data that has been written to the volume on the recovery site (now the
PVOL) is then copied back to the SVOL (now on the primary site). During this time the
package continues running on the recovery site.
Failback to the primary datacenter
After resynchronization is complete, you can halt the package on the recovery site, and restart it
on the primary site. Metrocluster will then swap the personalities between the PVOL and the SVOL,
returning PVOL status to the primary site.
NOTE: The preceding steps are automated, provided that the auto variable AUTO_PSUEPSUS is set to
its default value of 1. After the Continuous Access link failure has been fixed, you only need to
halt the package on the recovery site and restart it on the primary site. However, if
you want to reduce the amount of application downtime, you must manually invoke pairresync
before failback.
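The manual variant described in the note can be sketched as an ordered command plan. The following Python helper only builds the command strings and does not run them. Everything specific in it is an assumption rather than something stated in this section: the function name, the device group, package and node names, the use of the -swaps option to resynchronize from the recovery site, the use of pairevtwait to wait for the PAIR state, and the timeout value. Verify each command against your RAID Manager and Serviceguard documentation before using it.

```python
import shlex


def failback_plan(device_group, package, primary_node, resync_timeout_s=3600):
    """Build (but do not run) the commands for a manual resync followed by failback.

    All names and option choices are illustrative assumptions, not values
    taken from the Metrocluster documentation.
    """
    dg = shlex.quote(device_group)
    pkg = shlex.quote(package)
    node = shlex.quote(primary_node)
    return [
        # Run on the recovery site while the package is still running there.
        # ASSUMPTION: -swaps is the appropriate resync direction for this case.
        f"pairresync -g {dg} -swaps",
        # ASSUMPTION: wait until the pair reports PAIR, or stop after the timeout.
        f"pairevtwait -g {dg} -s pair -t {int(resync_timeout_s)}",
        # Halt the package on the recovery site, then start it on the primary site.
        f"cmhaltpkg {pkg}",
        f"cmrunpkg -n {node} {pkg}",
    ]


# Hypothetical names, printed for review only.
for command in failback_plan("dgA", "pkgA", "node1", resync_timeout_s=600):
    print(command)
```

Resynchronizing while the package is still running on the recovery site means that most of the data copy is finished before the package is halted, which is where the reduction in downtime comes from.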
Factors affecting package startup time
In a journal group, many journal volumes can be configured to hold a significant amount of
journal data (host-write data). As a result, the package startup time might increase significantly
when a Metrocluster Continuous Access package fails over. Package startup is delayed in these
situations:
1. When recovering from broken pair affinity. On failover, the SVOL pulls all the journal data
from the PVOL site. The time required to complete the data transfer to the SVOL depends on the
amount of outstanding journal data in the PVOL and the bandwidth of the Continuous Access links.
2. When host I/O is faster than Continuous Access data replication. The outstanding data that has
not yet been replicated to the SVOL accumulates in the journal volumes. When the package fails
over to the SVOL site, the SVOL pulls all the journal data from the PVOL site. The time required to
complete the data transfer to the SVOL depends on the bandwidth of the Continuous Access links
and the amount of outstanding data in the PVOL journal volume.
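Both situations come down to the same relationship: the added delay is roughly the outstanding journal data divided by the usable link bandwidth. The following Python sketch illustrates that arithmetic only. The function name, the efficiency factor, and the sample figures are assumptions made for illustration, and the result ignores array and protocol overhead, so treat it as a rough lower bound rather than a prediction.

```python
def estimate_journal_drain_seconds(outstanding_bytes, link_bits_per_second, efficiency=1.0):
    """Rough lower bound on the time for the SVOL to pull outstanding journal data.

    efficiency is an assumed fraction (0 < efficiency <= 1) of the raw link
    bandwidth that is actually usable for replication traffic.
    """
    if outstanding_bytes < 0:
        raise ValueError("outstanding_bytes must be non-negative")
    if link_bits_per_second <= 0:
        raise ValueError("link_bits_per_second must be positive")
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    # Convert the link speed from bits to bytes per second, scaled by efficiency.
    usable_bytes_per_second = link_bits_per_second * efficiency / 8
    return outstanding_bytes / usable_bytes_per_second


# Hypothetical figures: 50 GiB of outstanding journal data over a 1 Gbit/s link,
# of which 80% is assumed to be usable for replication.
seconds = estimate_journal_drain_seconds(50 * 1024**3, 1_000_000_000, efficiency=0.8)
print(f"Estimated drain time: {seconds / 60:.1f} minutes")
```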
Data maintenance with the failure of a Metrocluster with Continuous Access XP P9000 for Linux
failover
The following sections, “Swap takeover failure (asynchronous/journal mode)” and “Takeover
timeout for Continuous Access journal mode”, describe data maintenance upon failure of a
Metrocluster with Continuous Access XP P9000 for Linux failover.
Swap takeover failure (asynchronous/journal mode)
When a device group pair state is SVOL-PAIR at the local site (the site where the package is
starting) and PVOL-PAIR at the remote site, Metrocluster Continuous Access performs a swap takeover.
The swap takeover fails if there is an internal error (for example, a cache or shared memory
failure) in the device group pair. In this case, if AUTO_NONCURDATA is set to 0, the package
is not started and the takeover command changes the SVOL state to SVOL-PSUE (SSWS).
The PVOL site either remains in PVOL-PAIR or is changed to PVOL-PSUE.
In the SVOL-PSUE (SSWS) state, the SVOL is read/write enabled and the data is usable
but not as current as the data on the PVOL.
In this case, either use FORCEFLAG to start up the package on the SVOL site or fix the problem and
resume data replication with the following procedure:
1. Split the device group pair completely (pairsplit -g <dg> -S).
2. Re-create the pair with the original PVOL as the source (use the paircreate command).
3. Start up the package on either the PVOL site or the SVOL site.
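The three steps above can be sketched as a command plan. The following Python helper only assembles the command strings so that they can be reviewed before anyone runs them. Only pairsplit -g <dg> -S comes from this section. The paircreate options (-vl, -f, -jp, -js), the use of cmrunpkg to start the package, the function name, and all sample names and journal IDs are assumptions for illustration. Check them against your RAID Manager and Serviceguard documentation, and run paircreate on a host at the original PVOL site so that the original PVOL is the copy source.

```python
import shlex


def recreate_pair_plan(device_group, package, node, fence="async", journal_ids=None):
    """Build (but do not run) the commands for the split, re-create, restart procedure.

    journal_ids is an optional (pvol_journal_id, svol_journal_id) tuple.
    All option choices other than 'pairsplit -g <dg> -S' are assumptions.
    """
    dg = shlex.quote(device_group)
    # Step 1: split the device group pair completely (from the procedure above).
    commands = [f"pairsplit -g {dg} -S"]
    # Step 2: re-create the pair with the local (original PVOL) volume as source.
    # ASSUMPTION: -vl and -f <fence> are the options that apply to your setup.
    create = f"paircreate -g {dg} -vl -f {shlex.quote(fence)}"
    if journal_ids is not None:
        pvol_journal_id, svol_journal_id = journal_ids
        # ASSUMPTION: journal mode needs explicit journal IDs on both sides.
        create += f" -jp {int(pvol_journal_id)} -js {int(svol_journal_id)}"
    commands.append(create)
    # Step 3: start the package on the chosen node (PVOL site or SVOL site).
    commands.append(f"cmrunpkg -n {shlex.quote(node)} {shlex.quote(package)}")
    return commands


# Hypothetical names and journal IDs, printed for review only.
for command in recreate_pair_plan("dgA", "pkgA", "node1", journal_ids=(0, 0)):
    print(command)
```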