HP StorageWorks P9000 Cluster Extension Software Administrator Guide (TB534-96009, February 2011)

TIP: For more information about using Hyper-V Live Migration with P9000 Cluster Extension, see
the white paper "Live Migration across data centers and disaster tolerant virtualization architecture
with HP StorageWorks Cluster Extension and Microsoft Hyper-V™" on the white papers website:
www.hp.com/storage/whitepapers.

Timing considerations for MSCS

P9000 Cluster Extension gives priority to disk array operations over cluster software operations.
If P9000 Cluster Extension invokes a disk pair resynchronization operation or gathers information
about the remote disk array, it waits until the requested status information is reported. This
gives data integrity priority over cluster software failover processes. However, this behavior can
cause P9000 Cluster Extension resources to fail in the following cases:

• P9000 Cluster Extension uses RAID Manager instances to communicate with the remote disk
  array. Depending on the setting of the RAID Manager instance timeout parameter and the
  number of remote instances, the online operation could time out. This can occur if the local
  RAID Manager instance cannot reach the remote RAID Manager instance.

• P9000 Cluster Extension tries to resynchronize disk pairs and waits until the RAID Manager
  device/copy group is in PAIR state if the ApplicationStartup resource property is set to
  RESYNCWAIT. RAID Manager and the P9000 or XP microcode fully support delta
  resynchronization; however, the delta between the primary and secondary disks could be
  large enough for the copy process to exceed the resource PendingTimeout value.

• The ResyncWaitTimeout object can cause P9000 Cluster Extension resources to fail if its value
  is higher than the resource PendingTimeout value.

• If running in fence level ASYNC, the default value of AsyncTakeoverTimeout can cause the
  resource to fail because its value exceeds the resource PendingTimeout value. The takeover
  process for fence level ASYNC can take much longer when slow communications links are in
  place.
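The last two conditions above compare P9000 Cluster Extension wait values against the resource
PendingTimeout value. The following Python sketch illustrates that comparison. It is not part of
the product: the function name and the dictionary layout are hypothetical, the numbers are
illustrative rather than product defaults, and all values are assumed to have been converted to
seconds by the caller.

```python
def find_timeout_conflicts(pending_timeout_s, clx_waits_s):
    """Return the names of wait values that exceed PendingTimeout.

    pending_timeout_s: resource PendingTimeout, in seconds (caller converts).
    clx_waits_s: mapping of property name -> wait time in seconds.
    """
    return sorted(
        name for name, wait_s in clx_waits_s.items()
        if wait_s > pending_timeout_s
    )


# Illustrative numbers only, not product defaults.
conflicts = find_timeout_conflicts(
    pending_timeout_s=180,
    clx_waits_s={"ResyncWaitTimeout": 90, "AsyncTakeoverTimeout": 3600},
)
print(conflicts)  # ['AsyncTakeoverTimeout']
```

Any name returned by this check identifies a wait that can outlast the cluster's pending state,
so either that value or PendingTimeout would need to be adjusted.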

To prevent takeover commands from being terminated by the resource PendingTimeout, measure
the time required to copy the installed disk array cache and adjust the resource PendingTimeout
value. When measuring the copy time, measure only the slowest link used for Continuous
Access Software. This ensures that the disk array cache can be transferred from the remote
disk array, even in the event of a single surviving replication link between the disk arrays.
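Before a measurement is available, a rough calculation can provide a starting point for the
PendingTimeout adjustment. The following Python sketch is an approximation only: the function
names, the link efficiency factor, and the safety factor are assumptions introduced for
illustration, and a value measured in the actual environment should always take precedence.

```python
import math


def estimate_cache_copy_seconds(cache_gib, slowest_link_mbit_s, link_efficiency=0.7):
    """Estimate the seconds needed to copy the whole cache over one link.

    cache_gib: installed disk array cache in GiB.
    slowest_link_mbit_s: nominal rate of the slowest replication link, Mbit/s.
    link_efficiency: assumed usable fraction of the nominal rate (hypothetical).
    """
    if cache_gib <= 0 or slowest_link_mbit_s <= 0 or not 0 < link_efficiency <= 1:
        raise ValueError("sizes and rates must be positive; efficiency in (0, 1]")
    cache_bits = cache_gib * 1024**3 * 8
    usable_bits_per_s = slowest_link_mbit_s * 1_000_000 * link_efficiency
    return cache_bits / usable_bits_per_s


def suggest_pending_timeout_seconds(copy_seconds, safety_factor=1.5):
    """Round the copy time up after applying an assumed safety margin."""
    return math.ceil(copy_seconds * safety_factor)


# Illustrative input: 64 GiB of cache over a single 155 Mbit/s link.
copy_s = estimate_cache_copy_seconds(cache_gib=64, slowest_link_mbit_s=155)
print(suggest_pending_timeout_seconds(copy_s))
```

The estimate uses only the slowest link because that is the worst case the text describes: a
single surviving replication link between the disk arrays.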

In general, because the failover environment is dispersed across two (or more) data centers, the
failover time cannot be expected to match that of a single data center with a single shared
disk device. Therefore, the following values of the P9000 Cluster Extension resource and of the
service or application using that resource must be adjusted, based on failover tests performed to
verify the proper configuration setup: FailoverPeriod, RestartPeriod, PendingTimeout, LookAlive, and
IsAlive.

In addition, the service or application's FailoverPeriod value must be higher than the resource's
RestartPeriod value, and both must be higher than the resource's PendingTimeout value.
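The ordering rule above can be expressed as a simple check. The following Python sketch is
illustrative only: the function name is hypothetical, and it assumes the caller has already
converted all three values to the same unit, because the cluster software may report them in
different units.

```python
def check_timing_order(failover_period, restart_period, pending_timeout):
    """Return a list of violated rules; an empty list means the order holds.

    All three arguments must be expressed in the same unit (caller converts).
    """
    problems = []
    if not failover_period > restart_period:
        problems.append("FailoverPeriod must be higher than RestartPeriod")
    if not failover_period > pending_timeout:
        problems.append("FailoverPeriod must be higher than PendingTimeout")
    if not restart_period > pending_timeout:
        problems.append("RestartPeriod must be higher than PendingTimeout")
    return problems


# Illustrative values in seconds, not product defaults.
print(check_timing_order(failover_period=21600, restart_period=900, pending_timeout=180))  # []
```

Running such a check after each change to PendingTimeout helps confirm that a longer timeout has
not overtaken RestartPeriod or FailoverPeriod.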

MSCS provides two parameters to adjust state change recognition/resolution:

• IsAlive

• LookAlive

P9000 Cluster Extension automatically calls the IsAlive function whenever the cluster service calls
the LookAlive function. Therefore, the IsAlive and LookAlive parameters must be set to the same value.

50 Configuring P9000 Cluster Extension for Windows