Veritas Storage Foundation 5.1 SP1 for Oracle RAC Administrator's Guide (5900-1512, April 2011)

Table 3-20 Fencing startup issues on SF Oracle RAC cluster (client cluster) nodes (continued)

Issue    Description and resolution

Consider the following situations to understand preexisting split-brain in server-based fencing:

■ There are three CP servers acting as coordination points. One of the three CP servers then becomes inaccessible. While it is in this state, one client node also leaves the cluster. When the inaccessible CP server restarts, it has a stale registration from the node that left the SF Oracle RAC cluster. In this case, no new nodes can join the cluster. Each node that attempts to join the cluster gets a list of registrations from each CP server. One CP server includes an extra registration (of the node that left earlier). This makes the joiner node conclude that a preexisting split-brain exists between the joiner node and the node that is represented by the stale registration.

■ All the client nodes crash simultaneously, so the fencing keys are not cleared from the CP servers. Consequently, when the nodes restart, the vxfen configuration fails and reports a preexisting split-brain.
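Both situations come down to the same comparison: a node ID is still registered on a CP server although that node is not a running cluster member. The following is a minimal sketch of that comparison only, not the actual vxfen logic. The function name find_stale_nodeids and the space-separated ID lists are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not part of the product): print every node ID that
# appears in the registered list but not in the live-member list.
#   $1  space-separated node IDs registered on a CP server
#   $2  space-separated node IDs of nodes currently running in the cluster
find_stale_nodeids() {
    registered=$1
    live=$2
    for id in $registered; do
        case " $live " in
            *" $id "*) ;;          # registered and live: not stale
            *) echo "$id" ;;       # registered but not live: stale
        esac
    done
}
```

For the first situation, find_stale_nodeids "0 1 2" "0 1" prints 2, the node that left while the CP server was inaccessible. For the second situation, an empty live list marks every registered ID as stale.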

These situations are similar to a preexisting split-brain with coordinator disks, where the administrator resolves the problem by running the vxfenclearpre command. Server-based fencing requires a similar solution that uses the cpsadm command.

Run the cpsadm command to clear a registration on a CP server:

# cpsadm -s cp_server -a unreg_node -c cluster_name -n nodeid

where cp_server is the virtual IP address or virtual hostname on which the CP server is listening, cluster_name is the VCS name for the SF Oracle RAC cluster, and nodeid specifies the node ID of the SF Oracle RAC cluster node. Ensure that fencing is not already running on a node before clearing its registration on the CP server.

After all stale registrations are removed, the joiner node can join the cluster.
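When the same stale registration must be cleared on more than one CP server, the cpsadm command can be repeated for each server. The following is a minimal sketch under stated assumptions: the function name clear_stale_registration, the DRY_RUN and CPSADM variables, and the example host and cluster names are hypothetical. Only the cpsadm options shown in this section are used. By default the sketch prints the commands instead of running them, so that they can be reviewed first.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of the product) around the documented
# cpsadm unreg_node command.
# Before a real run, ensure that fencing is not running on the node whose
# registration is being cleared.
CPSADM=${CPSADM:-cpsadm}    # path to cpsadm (assumed to be in PATH)
DRY_RUN=${DRY_RUN:-1}       # 1 = only print the commands, 0 = execute them

# Usage: clear_stale_registration cluster_name nodeid cp_server [cp_server ...]
clear_stale_registration() {
    cluster_name=$1
    nodeid=$2
    shift 2
    for cp_server in "$@"; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "$CPSADM -s $cp_server -a unreg_node -c $cluster_name -n $nodeid"
        else
            "$CPSADM" -s "$cp_server" -a unreg_node \
                -c "$cluster_name" -n "$nodeid" || return 1
        fi
    done
}
```

For example, clear_stale_registration rac_cluster1 2 cps1.example.com cps2.example.com prints one cpsadm command per CP server. Set DRY_RUN=0 to run the commands after checking them.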

Issue: Preexisting split-brain

Issues during online migration of coordination points

During online migration of coordination points using the vxfenswap utility, the operation is automatically rolled back if a failure is encountered during validation of coordination points from all the cluster nodes.

Validation failure of the new set of coordination points can occur in the following circumstances:

■ The /etc/vxfenmode file is not updated on all the SF Oracle RAC cluster nodes. A node that still has an old /etc/vxfenmode file picks up its new coordination points from that old file.

Troubleshooting SF Oracle RAC

Troubleshooting I/O fencing
