Designing Disaster Recovery Clusters using Metroclusters and Continentalclusters, Reprinted October 2011 (5900-1881)

Table 6 Monitored States and Possible Causes (continued)
Cluster Event (Old state -> New state) | Cluster-related Causes | Network-related Causes
Unreachable -> Up | Cluster nodes were rebooted and the cluster started | Network came up and the cluster was already running
Error -> Up | Error resolved, cluster is up | Network problem was fixed, cluster is up
NOTE: There is only one condition under which cmclsentryd will determine that the cluster
has Error status: all nodes are unreachable except those which have Serviceguard Error status. (If
any nodes are Down or Up, then the cluster status will take one of those values, rather than Error.)
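The condition in this note can be sketched as a small predicate. This is only an illustration of the rule as stated here, not the cmclsentryd implementation. The names NodeStatus and cluster_is_error are hypothetical, and the requirement that at least one node actually reports Error is an assumption.

```python
from enum import Enum


class NodeStatus(Enum):
    # Hypothetical labels for the node states named in the note.
    UP = "Up"
    DOWN = "Down"
    UNREACHABLE = "Unreachable"
    ERROR = "Error"


def cluster_is_error(node_statuses):
    """Return True only when the single Error condition from the note holds."""
    statuses = list(node_statuses)
    # Any Up or Down node rules out Error: the cluster takes one of those values.
    if any(s in (NodeStatus.UP, NodeStatus.DOWN) for s in statuses):
        return False
    # Assumption: at least one node must report Error. A cluster whose nodes
    # are all Unreachable is treated here as not being in Error status.
    return any(s is NodeStatus.ERROR for s in statuses)
```

For example, a cluster with one Unreachable node and one Error node satisfies the condition, while adding a single Up or Down node means it does not.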
Interpreting the Significance of Cluster Events
Because some cluster events (for example, Up -> Unreachable) can be caused by changes in either
a cluster state or a network state, additional independent information is required to achieve the
primary objective of determining whether you need to recover a cluster’s applications. Sources of
independent information include:
Contact with the network provider
Contact with the administrator of the monitored cluster
Contact with the local cluster administrator
Contact with company executives
When problematic cluster events persist, obtain as much information as possible, including
authorization to recover, if your business practices require this, and then issue the Continentalclusters
recovery command, cmrecovercl.
How Notifications Work
A central part of the operation of Continentalclusters is the transmission of notifications following
the detection of a cluster event. Notifications occur at specifically coded times, and at two different
levels:
Alert — when a cluster event should be considered noteworthy.
Alarm — when an event shows evidence of a cluster failure.
Notifications are typically sent as:
Email messages
SNMP traps
Text log files
OPC messages to OpenView IT/Operations
In addition, notifications are sent to the eventlog file located in the /var/opt/resmon/log/cc
directory on the system where monitoring is taking place.
NOTE: An email message can be sent to an address supplied by a pager service that will forward
the message to a specified pager system. (Contact your pager service provider for more information.)
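The idea of notifications occurring at coded times and at two levels can be sketched as follows. This is a conceptual model only. It is not Continentalclusters configuration syntax or product code, and the names NotificationRule and due_notifications, the destination labels, and the time values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NotificationRule:
    # Hypothetical model of one coded notification.
    level: str          # "alert" or "alarm"
    after_seconds: int  # how long the cluster event must persist first
    destination: str    # illustrative label, e.g. "email" or "snmp"


def due_notifications(rules, elapsed_seconds, already_sent=frozenset()):
    """Return the rules whose coded time has passed and that were not sent yet."""
    due = [
        rule
        for rule in rules
        if rule.after_seconds <= elapsed_seconds and rule not in already_sent
    ]
    # Earliest coded time first, so an alert precedes a later alarm.
    return sorted(due, key=lambda rule: rule.after_seconds)


# Illustrative values only: an alert after 5 minutes, an alarm after 15.
rules = [
    NotificationRule("alarm", 900, "snmp"),
    NotificationRule("alert", 300, "email"),
]
pending = due_notifications(rules, elapsed_seconds=600)
```

With these assumed values, only the alert is due after 600 seconds. Once 900 seconds have elapsed, the alarm is due as well.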
Alerts
Alerts are intended to be informational. Some typical uses of alerts include:
Notification that a cluster has been halted for a significant amount of time.
Notification that a cluster has come up after being down or unreachable.