Designing Disaster Recovery Clusters using Metroclusters and Continentalclusters, Reprinted October 2011 (5900-1881)



• Failure of the entire secondary Data Center for a given application package

• Failure of the secondary P9000 or XP Series disk array for a given application package while the application is running on a primary host

Following is a partial list of failures that require full resynchronization to restore disaster-tolerant

data protection. Full resynchronization is automatically initiated for these failures by moving the

application package back to its primary host after repairing the failure:

• Failure of the entire primary data center for a given application package

• Failure of all of the primary hosts for a given application package

• Failure of the primary P9000 or XP Series disk array for a given application package

• Failure of all Continuous Access links with restart of the application on a secondary host

Pairs must be manually recreated if both the primary and secondary P9000 or XP Series disk arrays
are in SMPL (simplex) state. Periodically review the files syslog.log and
/etc/cmcluster/pkgname/pkgname.log for messages, warnings, and recommended actions, and
review them again after any system, data center, or application failure.
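The log review described above can be partly scripted. The following is a minimal sketch, not a documented procedure: the function name scan_logs and the keyword list are assumptions chosen for illustration, and the actual message formats written by Serviceguard and Metrocluster may differ. It prints each matching line prefixed with its file name and line number.

```shell
#!/bin/sh
# Sketch: list lines in the given log files that may need attention.
# scan_logs and the keyword pattern below are illustrative assumptions.
scan_logs() {
    for log in "$@"; do
        if [ -r "$log" ]; then
            # -i: ignore case, -n: print line numbers, -E: extended regex.
            # The extra /dev/null argument makes grep print the file name.
            grep -inE 'warning|error|resync' "$log" /dev/null
        else
            echo "skipping unreadable file: $log" >&2
        fi
    done
}

# Example call. The syslog path is the usual HP-UX location (an assumption
# here); replace pkgname with the name of your package.
# scan_logs /var/adm/syslog/syslog.log /etc/cmcluster/pkgname/pkgname.log
```

Treat the output as a starting point for a manual review of the complete logs, since a keyword match can miss recommended actions that are phrased differently.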

Full resynchronization must be manually initiated after repairing the following failures:

• Failure of the secondary P9000 or XP Series disk array for a given application package followed by application startup on a primary host

• Failure of all Continuous Access links with Fence Level NEVER or ASYNC, with restart of the application on a primary host

Using the pairresync Command

The pairresync command can be used with special options after a failover in which the recovery
site has started the application and has processed transaction data on the disks at the recovery
site, while the disks at the primary site remain intact. After the Continuous Access link is fixed,
use the pairresync command in one of the following two ways, depending on which site you are on:

• pairresync -swapp (from the primary site)

• pairresync -swaps (from the failover site)

These options take advantage of the fact that the recovery site maintains a bitmap of the modified
data sectors on the recovery array. Either version of the command swaps the personalities of
the volumes, with the PVOL becoming the SVOL and the SVOL becoming the PVOL. With the
personalities swapped, any data that has been written to the volume on the failover site (now
the PVOL) is copied back to the volume on the primary site (now the SVOL). During this time the
package continues running on the failover site. After resynchronization is complete, you can halt
the package on the failover site and restart it on the primary site. Metrocluster will then swap the
personalities between the PVOL and the SVOL, returning PVOL status to the primary site.

NOTE: The preceding steps are automated provided the default value of 1 is being used for the

auto variable AUTO_PSUEPSUS. Once the Continuous Access link failure has been fixed, the user

only needs to halt the package on the target disk site and restart on the source disk site. However,

if you want to reduce the amount of application downtime, you should manually invoke

pairresync before failback.
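The choice between the two pairresync options can be sketched as a small helper that prints the command for the site you are on instead of running it. This is an illustration under stated assumptions: the function name resync_command is hypothetical, and addressing the pair with -g followed by a device group name is assumed here, so check your Raid Manager configuration for the group that belongs to the package.

```shell
#!/bin/sh
# Sketch: print (do not run) the pairresync command for the current site.
# resync_command is a hypothetical helper; "-g <group>" is an assumption.
resync_command() {
    site=$1     # "primary" or "failover"
    group=$2    # device group name for the package (hypothetical)
    if [ -z "$group" ]; then
        echo "usage: resync_command primary|failover <group>" >&2
        return 1
    fi
    case "$site" in
        primary)  echo "pairresync -g $group -swapp" ;;   # run from the primary site
        failover) echo "pairresync -g $group -swaps" ;;   # run from the failover site
        *)
            echo "usage: resync_command primary|failover <group>" >&2
            return 1
            ;;
    esac
}

# Example: resync_command failover pkgdg
```

Printing the command first lets you confirm the site and group before starting a resynchronization that swaps the PVOL and SVOL roles.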

Failback

After resynchronization is complete, you can halt the package on the failover site, and restart it

on the primary site. Metrocluster will then swap the personalities between the PVOL and the SVOL,

returning PVOL status to the primary site.
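The failback sequence can be written down as a dry-run plan that prints the Serviceguard commands in order without running them. This is a sketch under assumptions: failback_plan is a hypothetical helper, the package and node names are placeholders, and the final cmmodpkg -e step (re-enabling package switching after a manual start) is a common follow-up that this section does not itself call for.

```shell
#!/bin/sh
# Sketch: print the failback steps for review; nothing is executed.
# failback_plan, the package name and the node name are hypothetical.
failback_plan() {
    pkg=$1            # package currently running on the failover site
    primary_node=$2   # node at the primary site that should run it
    if [ -z "$pkg" ] || [ -z "$primary_node" ]; then
        echo "usage: failback_plan <package> <primary_node>" >&2
        return 1
    fi
    echo "cmhaltpkg $pkg"                   # halt the package on the failover site
    echo "cmrunpkg -n $primary_node $pkg"   # restart it on the primary site
    echo "cmmodpkg -e $pkg"                 # assumed extra step: re-enable switching
}

# Example: failback_plan pkgA node1
```

Run the printed commands only after resynchronization has completed, so that the primary site starts the package with current data.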

Completing and Running a Metrocluster Solution with Continuous Access P9000 or XP 195