Designing Disaster Recovery Clusters using Metroclusters and Continentalclusters, Reprinted October 2011 (5900-1881)

#### #### definitions will help in making the decision #### ####
#### #### whether or not to issue the cmrecovercl(1m) #### ####
#### #### command. Each monitoring definition specifies #### ####
#### #### a cluster event along with the messages #### ####
#### #### that should be sent to system administrators #### ####
#### #### or other IT staff. #### ####
#### #### All messages are appended to the default log #### ####
#### #### /var/opt/resmon/log/cc/eventlog as well as to #### ####
#### #### the destination you specify below. #### ####
#### #### A cluster event takes place when a monitor #### ####
#### #### that is located on one cluster detects a #### ####
#### #### significant change in the condition of #### ####
#### #### another cluster. The monitored cluster #### ####
#### #### conditions are: #### ####
#### #### UNREACHABLE - the cluster is unreachable. #### ####
#### #### This will occur when the communication link #### ####
#### #### to the cluster has gone down, as in a WAN #### ####
#### #### failure, or when all the nodes in the #### ####
#### #### cluster have failed. #### ####
#### #### DOWN - the cluster is down but nodes are #### ####
#### #### responding. This will occur when the cluster #### ####
#### #### is halted, but some or all of the member #### ####
#### #### nodes are booted and communicating with the #### ####
#### #### monitoring cluster. #### ####
#### #### UP - the cluster is up. #### ####
#### #### ERROR - there is a mismatch of cluster #### ####
#### #### versions or a security error. #### ####
#### #### A change from one of these conditions to #### ####
#### #### another one is a cluster event. You can #### ####
#### #### define alert or alarm states based on the #### ####
#### #### length of time since the cluster event was #### ####
#### #### observed. Some events are noteworthy at the #### ####
#### #### time they occur, and some are noteworthy #### ####
#### #### when they persist over time. Setting the #### ####
#### #### elapsed time to zero results in a message #### ####
#### #### being sent as soon as the event takes place. #### ####
#### #### Setting the elapsed time to 5 minutes results #### ####
#### #### in a message being sent when the condition #### ####
#### #### has persisted for 5 minutes. #### ####
#### #### An alert is intended as informational only. #### ####
#### #### Alerts may be sent for any type of cluster #### ####
#### #### condition. For an alert, a notification is #### ####
#### #### sent to a system administrator or other #### ####
#### #### destination. Alerts are not intended to #### ####
#### #### indicate the need for recovery; for an #### ####
#### #### alert, cmrecovercl(1m) remains disabled. #### ####
#### #### #### ####
#### #### An alarm is an indication that a condition ####
#### #### exists that may require recovery. For an ####
#### #### alarm, a notification is sent, and in ####
#### #### addition, the cmrecovercl(1m) command is ####
#### #### enabled for immediate execution, allowing ####
#### #### the administrator to carry out cluster ####
#### #### recovery. An alarm can only be defined for ####
#### #### an UNREACHABLE or DOWN condition in the ####
#### #### monitored cluster. ####
#### #### A notification defines a message that is ####
#### #### appended to the log file ####
#### #### /var/opt/resmon/log/cc/eventlog and sent ####
#### #### to other specified destinations, including ####
#### #### email addresses, SNMP traps, the system ####
#### #### console, or the syslog file. The message ####
#### #### string in a notification can be no more than ####
#### #### 170 characters. Enter notifications in one of ####
#### #### the following forms: ####
Building the Continentalclusters Configuration 83
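The alert and alarm rules described in the template comments can be modelled in a few lines of Python. This is only an illustrative sketch, not part of Continentalclusters and not its configuration syntax: the names `Condition`, `Notification`, `due_notifications`, and `recovery_enabled` are hypothetical. It encodes only what the comments state: a notification becomes due once a condition has persisted for the configured elapsed time, an alarm may be defined only for UNREACHABLE or DOWN, and a message string may hold at most 170 characters.

```python
from dataclasses import dataclass
from enum import Enum

MAX_MESSAGE_LEN = 170  # message-string limit stated in the template comments


class Condition(Enum):
    UNREACHABLE = "UNREACHABLE"
    DOWN = "DOWN"
    UP = "UP"
    ERROR = "ERROR"


# Per the template comments, alarms may be defined only for these conditions.
ALARM_CONDITIONS = {Condition.UNREACHABLE, Condition.DOWN}


@dataclass(frozen=True)
class Notification:
    """Hypothetical model of one alert or alarm definition."""
    kind: str             # "alert" or "alarm"
    condition: Condition  # monitored cluster condition
    after_minutes: int    # elapsed time since the cluster event; 0 = at once
    message: str

    def __post_init__(self):
        if self.kind not in ("alert", "alarm"):
            raise ValueError("kind must be 'alert' or 'alarm'")
        if self.after_minutes < 0:
            raise ValueError("elapsed time cannot be negative")
        if len(self.message) > MAX_MESSAGE_LEN:
            raise ValueError("message exceeds 170 characters")
        if self.kind == "alarm" and self.condition not in ALARM_CONDITIONS:
            raise ValueError("alarms are allowed only for UNREACHABLE or DOWN")


def due_notifications(definitions, condition, elapsed_minutes):
    """Definitions for `condition` whose elapsed-time threshold has been reached."""
    return [d for d in definitions
            if d.condition is condition and d.after_minutes <= elapsed_minutes]


def recovery_enabled(definitions, condition, elapsed_minutes):
    """True once any alarm is due, modelling cmrecovercl(1m) becoming enabled."""
    return any(d.kind == "alarm"
               for d in due_notifications(definitions, condition, elapsed_minutes))


# Example definitions (illustrative values only).
example = [
    Notification("alert", Condition.UNREACHABLE, 0, "Cluster unreachable."),
    Notification("alert", Condition.UNREACHABLE, 5, "Still unreachable after 5 min."),
    Notification("alarm", Condition.UNREACHABLE, 10, "Unreachable 10 min; consider recovery."),
]
```

With these example definitions, `recovery_enabled(example, Condition.UNREACHABLE, 5)` is false because only the two alerts are due, while the same call with an elapsed time of 10 is true because the alarm threshold has been reached. A real monitor would send each message once when its threshold is crossed; the sketch only reports which definitions have become due.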