HP VAN SDN Controller Administrator Guide

5 Region Configuration

Overview

This chapter describes the configuration needed to support High Availability (HA) between HP VAN
SDN controllers and OpenFlow switches. HA is set up by creating region configurations in the
controllers using the REST APIs provided by the Role Orchestration Service (ROS).

Putting the region configurations in place in a controller team ensures seamless failover and
failback among the configured controllers for the network devices specified in a region. That is,
when a master controller experiences a fault, the Role Orchestration Service ensures that a slave
controller immediately assumes the master role over the group of network devices for which the
failed controller held the master role. Once the failed controller recovers and rejoins the team,
the Role Orchestration Service restores that controller’s role; that is, the rejoining
controller takes back the role for which it was configured with respect to the network
devices in the region. If the controller was configured to operate as the master in a region, it is
restored to the master role. If it was configured to operate in the slave role, it resumes
operation in the slave role.

Once the region definition(s) are in place, the ROS ensures that a master controller is always
available to the respective network element(s) even if the configured master fails or there is a
disruption of the communication channel between the controller and the network device(s).
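
The failover and failback behavior described above can be pictured with a small model. This is only an illustrative sketch, not the ROS implementation: the Region class, the effective_master function, and the controller names are hypothetical. The guide says only that ROS elects one of the slaves, so picking the first live slave in configured order is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    """Hypothetical region definition: one configured master plus ordered slaves."""
    master: str
    slaves: tuple


def effective_master(region, alive):
    """Return the controller that should currently hold the master role, or None.

    The configured master wins whenever it is alive (this models failback).
    Otherwise the first live slave takes over (this models failover; the
    selection order is an assumption, not documented ROS behavior).
    """
    if region.master in alive:
        return region.master
    for slave in region.slaves:
        if slave in alive:
            return slave
    return None


region = Region(master="c1", slaves=("c2", "c3"))
print(effective_master(region, {"c1", "c2", "c3"}))  # c1: normal operation
print(effective_master(region, {"c2", "c3"}))        # c2: c1 failed, failover
print(effective_master(region, {"c1", "c2", "c3"}))  # c1: c1 rejoined, failback
```

Modeling the current master as a function of the configured roles and the set of live controllers makes failback fall out naturally: as soon as the configured master is alive again, it is the answer.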

Note

All region configuration operations (create, update, refresh, and delete) using the REST API
require that every controller specified in the region, including the master controller and all
slave controllers, be in an active state. If any controller in the region is in a "down" state,
then the region configuration operations are disallowed.
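
A script that drives region configuration could apply the same rule as a pre-flight check before issuing any request. The sketch below is a minimal illustration under stated assumptions: the function names, the status dictionary, and the "active"/"down" strings are hypothetical stand-ins, and nothing here calls the controller's actual REST API.

```python
def inactive_controllers(region_controllers, controller_status):
    """Return the controllers named in the region that are not active.

    controller_status is a hypothetical mapping of controller name to a
    status string; a missing entry is treated as not active.
    """
    return [c for c in region_controllers if controller_status.get(c) != "active"]


def ensure_region_operation_allowed(region_controllers, controller_status):
    """Raise RuntimeError if any controller in the region is not active."""
    down = inactive_controllers(region_controllers, controller_status)
    if down:
        raise RuntimeError("region operation disallowed; not active: " + ", ".join(down))


status = {"c1": "active", "c2": "active", "c3": "down"}
ensure_region_operation_allowed(["c1", "c2"], status)  # passes silently
try:
    ensure_region_operation_allowed(["c1", "c2", "c3"], status)
except RuntimeError as err:
    print(err)  # region operation disallowed; not active: c3
```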

Failover

ROS triggers the failover operation in two cases:

• Controller failure: The ROS detects a controller failure in a team through notifications from the
  teaming subsystem. If ROS determines that the failed controller instance was a master for any
  region, it immediately elects one of the backup (slave) controllers to assume the master role
  over the affected region.

• Device disconnect: The ROS instance in a controller is notified of a communication failure
  with network device(s) through the Controller Service notifications. It instantly communicates
  with all ROS instances in the team to determine if the network device(s) in question are still
  connected to any of the backup (slave) controllers within the team. If that is the case, it elects
  one of the slaves to assume the master role over the affected network device(s).
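
The two failover triggers can be sketched as two small handlers. This is a hedged illustration only: the function names and data shapes are hypothetical, the connections mapping stands in for the answers gathered from the other ROS instances in the team, and choosing the first eligible slave is an assumption, since the guide does not say how the slave is elected.

```python
def on_controller_failure(regions, failed, alive):
    """Case 1: for each region whose master failed, pick a live slave.

    regions maps a region name to {"master": name, "slaves": [names]}.
    Returns {region name: new master} for the regions that change.
    """
    changes = {}
    for name, region in regions.items():
        if region["master"] != failed:
            continue  # the failed controller was not master here
        candidates = [s for s in region["slaves"] if s in alive]
        if candidates:
            changes[name] = candidates[0]
    return changes


def on_device_disconnect(slaves, device, connections):
    """Case 2: return a slave that is still connected to the device, or None.

    connections maps a controller name to the set of devices it reports
    as connected.
    """
    for slave in slaves:
        if device in connections.get(slave, set()):
            return slave
    return None


regions = {
    "r1": {"master": "c1", "slaves": ["c2", "c3"]},
    "r2": {"master": "c2", "slaves": ["c1", "c3"]},
}
print(on_controller_failure(regions, failed="c1", alive={"c2", "c3"}))  # {'r1': 'c2'}
print(on_device_disconnect(["c2", "c3"], "sw1", {"c2": set(), "c3": {"sw1"}}))  # c3
```

In the second case the handler returns None when no slave still reaches the device, which matches the text: a new master is elected only if some slave remains connected.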

Failback

When the configured master recovers from a failure and rejoins the team, or when the connection

from the disconnected device(s) with the original master is resumed, ROS initiates a failback