HP VAN SDN Controller Administrator Guide

5 Region Configuration
Overview
This chapter describes the configuration needed to support High Availability (HA) between HP VAN
SDN controllers and OpenFlow switches. You do this by creating region configurations in the
controllers using the REST APIs provided by the Role Orchestration Service (ROS).
Putting the region configurations in place in a controller team ensures seamless failover and
failback among the configured controllers for the specified network devices in a region. That is,
when a master controller experiences a fault, the Role Orchestration Service ensures that a slave
controller immediately assumes the master role over the group of network devices for which the
failed controller was the master. Once the failed controller recovers and rejoins the team,
the Role Orchestration Service restores the role for which that controller was configured with
respect to the network devices in the region. If the controller was configured to operate as the
master in a region, it is restored to the master role. If it was configured to operate in the
slave role, it resumes operation in the slave role.
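The role restoration rule above can be illustrated with a minimal sketch. The region layout (a dictionary with "master" and "slaves" keys), the function name restored_role, and the IP addresses are assumptions made for illustration; they are not the controller's data model or API.

```python
from typing import Dict, List, Union

# Hypothetical region layout: configured master IP and a list of slave IPs.
RegionConfig = Dict[str, Union[str, List[str]]]


def restored_role(region: RegionConfig, controller: str) -> str:
    """Return the role a rejoining controller resumes in this region.

    The role is always the one the controller was configured with,
    regardless of which controller held the master role during the outage.
    """
    if controller == region["master"]:
        return "master"
    if controller in region["slaves"]:
        return "slave"
    raise ValueError(f"{controller} is not a member of this region")


# Example region with one configured master and two configured slaves.
example_region: RegionConfig = {
    "master": "10.0.0.1",
    "slaves": ["10.0.0.2", "10.0.0.3"],
}
```

With this example, restored_role(example_region, "10.0.0.1") yields "master" and restored_role(example_region, "10.0.0.2") yields "slave".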
Once the region definition(s) are in place, the ROS ensures that a master controller is always
available to the respective network element(s) even if the configured master fails or there is a
disruption of the communication channel between the controller and the network device(s).
Note
All region configuration operations (create, update, refresh, and delete) using the REST API
require that every controller specified in the region, including the master controller and all
slave controllers, be in an active state. If any controller in the region is in a "down" state,
then the region configuration operations are disallowed.
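The precondition in this note can be expressed as a simple check that a client might run before issuing a region request. This is only a sketch: the Region class, the function name, and the "active"/"down" state labels are assumptions for illustration, and how controller states are actually obtained is not shown.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Region:
    """Hypothetical in-memory model of a region, not the controller's schema."""
    master: str                                        # IP of the configured master
    slaves: List[str] = field(default_factory=list)    # IPs of the slave controllers
    devices: List[str] = field(default_factory=list)   # device identifiers


def region_change_allowed(region: Region, states: Dict[str, str]) -> bool:
    """Return True only if every controller in the region is active.

    `states` maps a controller IP to "active" or "down" (assumed labels).
    A controller that is missing from `states` is treated as down.
    """
    members = [region.master, *region.slaves]
    return all(states.get(ip) == "active" for ip in members)
```

For a region with master 10.0.0.1 and slave 10.0.0.2, the check passes only when both addresses map to "active"; a single "down" or unknown controller blocks create, update, refresh, and delete alike.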
Failover
ROS triggers the failover operation in two cases:
Controller failure: The ROS detects a controller failure in a team through notifications from the
teaming subsystem. If ROS determines that the failed controller instance was a master for any
region, it immediately elects one of the backup (slave) controllers to assume the master role
over the affected region.
Device disconnect: The ROS instance in a controller is notified of a communication failure
with network device(s) through the Controller Service notifications. It instantly communicates
with all ROS instances in the team to determine if the network device(s) in question are still
connected to any of the backup (slave) controllers within the team. If that is the case, it elects
one of the slaves to assume the master role over the affected network device(s).
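The two failover triggers above can be sketched as follows. This is a simplified model under stated assumptions, not the ROS implementation: the guide does not say how a slave is chosen, so the sketch assumes the first eligible slave in configured order, and the dictionary keys and function names are hypothetical.

```python
from typing import Dict, List, Optional, Set


def elect_master(candidates: List[str], usable: Set[str]) -> Optional[str]:
    """Pick the first candidate that is usable (assumed election policy)."""
    for ip in candidates:
        if ip in usable:
            return ip
    return None


def on_controller_failure(region: Dict, failed: str, active: Set[str]) -> Optional[str]:
    """Case 1: a controller in the team has failed.

    Returns the newly elected master, or None if the failed controller
    was not the master of this region or no active slave is available.
    """
    if region["current_master"] != failed:
        return None
    new_master = elect_master(region["slaves"], active - {failed})
    if new_master is not None:
        region["current_master"] = new_master
    return new_master


def on_device_disconnect(region: Dict, device: str,
                         connections: Dict[str, Set[str]]) -> Optional[str]:
    """Case 2: the master lost its connection to a device.

    `connections` maps a controller IP to the set of devices still
    connected to it. Returns the slave elected as master for the
    device, or None if no slave can still reach it.
    """
    reachable = {ip for ip, devs in connections.items() if device in devs}
    return elect_master(region["slaves"], reachable)
```

In the first case the whole region moves to the elected slave; in the second, only the affected device is reassigned, and only if some slave still has a connection to it.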
Failback
When the configured master recovers from a failure and rejoins the team, or when the connection
from the disconnected device(s) with the original master is resumed, ROS initiates a failback