HP StorageWorks P9000 Cluster Extension Software Administrator Guide (TB534-96009, February 2011)
TIP: For more information about using Hyper-V Live Migration with P9000 Cluster Extension, see
the white paper Live Migration across data centers and disaster tolerant virtualization architecture
with HP StorageWorks Cluster Extension and Microsoft Hyper-V™ on the white papers website:
www.hp.com/storage/whitepapers.
Timing considerations for MSCS
P9000 Cluster Extension gives priority to disk array operations over cluster software operations.
If P9000 Cluster Extension invokes a disk pair resynchronization operation or gathers information
about the remote disk array, P9000 Cluster Extension waits until the requested status information
is reported. This puts data integrity ahead of cluster software failover speed, and it can cause
P9000 Cluster Extension resources to fail in the following situations:
• P9000 Cluster Extension uses RAID Manager instances to communicate with the remote disk
array. Depending on the setting of the RAID Manager instance timeout parameter and the
number of remote instances, the online operation could time out. This can occur if the local
RAID Manager instance cannot reach the remote RAID Manager instance.
• P9000 Cluster Extension tries to resynchronize disk pairs and waits until the RAID Manager
device/copy group is in PAIR state if the ApplicationStartup resource property is set to
RESYNCWAIT. RAID Manager and the P9000 or XP microcode fully support delta
resynchronization; however, the delta between the primary and secondary disks could be
large enough for the copy process to exceed the resource PendingTimeout value.
• The ResyncWaitTimeout object can cause P9000 Cluster Extension resources to fail if its value
is higher than the resource PendingTimeout value.
• If the device group runs in fence level ASYNC, the default value of AsyncTakeoverTimeout can
cause the resource to fail because that value exceeds the resource PendingTimeout value. The
takeover process for fence level ASYNC can take much longer when slow communications links
are in place.
To prevent takeover commands from being terminated by the resource PendingTimeout, measure
the time required to copy the installed disk array cache and adjust the resource PendingTimeout
value. When measuring the copy time, measure only the slowest link used for Continuous
Access Software. This ensures that the disk array cache can be transferred from the remote
disk array, even in the event of a single surviving replication link between the disk arrays.
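The timing relationships in the list above can be sketched as a small pre-check. This is a minimal illustration and not part of the product: the function names, the cache size, the link rate, and the timeout values are all hypothetical, and all values are assumed to be expressed in seconds. The estimate uses the raw link rate and ignores protocol overhead and ongoing host writes, so it is only a lower bound and does not replace measuring the slowest Continuous Access link.

```python
def cache_copy_seconds(cache_gib: float, slowest_link_mbit_s: float) -> float:
    """Lower-bound estimate of the time to move the full cache over one link.

    Assumption: cache size in GiB, raw link rate in Mbit/s (10**6 bits/s).
    """
    if cache_gib < 0 or slowest_link_mbit_s <= 0:
        raise ValueError("cache size must be >= 0 and link rate must be > 0")
    cache_bits = cache_gib * 1024**3 * 8
    return cache_bits / (slowest_link_mbit_s * 1_000_000)


def timeouts_exceeding_pending(pending_timeout_s: float, **timeouts_s: float) -> list[str]:
    """Return the names of timeouts that are higher than PendingTimeout."""
    return sorted(
        name for name, value in timeouts_s.items() if value > pending_timeout_s
    )


if __name__ == "__main__":
    # Hypothetical example values, not product defaults.
    estimate = cache_copy_seconds(cache_gib=64, slowest_link_mbit_s=1000)
    print(f"Estimated cache copy time: {estimate:.0f} s")
    print(
        timeouts_exceeding_pending(
            180, ResyncWaitTimeout=90, AsyncTakeoverTimeout=600
        )
    )
```

Any name returned by the second function marks a setting that could let the resource PendingTimeout expire before the disk array operation completes.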
In general, because the failover environment is dispersed into two (or more) data centers, the
failover time cannot be expected to be the same as that in a single data center with a single shared
disk device. Therefore, the following values of the P9000 Cluster Extension resource and the service
and application using that resource must be adjusted, based on failover tests performed to verify
the proper configuration setup: FailoverPeriod, RestartPeriod, PendingTimeout, LookAlive, and
IsAlive.
In addition, the service or application's FailoverPeriod value must be higher than the resource’s
RestartPeriod value, and both must be higher than the resource’s PendingTimeout value.
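The ordering rule above can be checked before the values are applied to the cluster. The following sketch is an illustration only: the function name and the sample values are hypothetical, and it assumes that all three values have already been converted to the same unit (seconds here), because the cluster software may store them in different units.

```python
def timing_order_problems(
    failover_period_s: float,
    restart_period_s: float,
    pending_timeout_s: float,
) -> list[str]:
    """Return a list of violated ordering rules (empty if the values are consistent).

    Rule from the text: FailoverPeriod > RestartPeriod, and both > PendingTimeout.
    Assumption: all arguments use the same unit (seconds).
    """
    problems = []
    if restart_period_s <= pending_timeout_s:
        problems.append("RestartPeriod must be higher than PendingTimeout")
    if failover_period_s <= pending_timeout_s:
        problems.append("FailoverPeriod must be higher than PendingTimeout")
    if failover_period_s <= restart_period_s:
        problems.append("FailoverPeriod must be higher than RestartPeriod")
    return problems


if __name__ == "__main__":
    # Hypothetical values chosen for illustration, not recommended settings.
    for problem in timing_order_problems(21600, 100, 180):
        print(problem)
```

An empty result means only that the ordering rule is satisfied; the values themselves still need to come from failover tests.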
MSCS provides two parameters to adjust state change recognition/resolution:
• IsAlive
• LookAlive
P9000 Cluster Extension automatically calls the IsAlive function whenever the cluster service calls
the LookAlive function. Therefore, the IsAlive and LookAlive parameters must be set to the same value.