Changes to BGP Multipath
The BGP multipath and ECMP behavior changes when the system becomes active after a fast-boot restart. The
system delays the computation and installation of additional paths to a destination into the BGP routing information base (RIB) and
forwarding table for a certain period of time. Additional paths, if any, are automatically computed and installed without the need for any
manual intervention in any of the following conditions:
After 30 seconds have elapsed since the system returned online after a restart
After all established peers have synchronized with the restarting system
A combination of the previous two conditions
One possible impact of this behavior change is that if the amount of traffic to a destination is higher than the volume of traffic that can be
carried over one path, a portion of that traffic might be dropped for a short duration (30-60 seconds) after the system comes up.
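The release conditions above can be sketched as a simple gate: additional paths are held back until either the 30-second timer expires or every expected peer has resynchronized. This is an illustrative model only; the class and method names below are not Dell EMC Networking OS APIs.

```python
import time

FAST_BOOT_MULTIPATH_DELAY = 30  # seconds, per the behavior described above


class MultipathGate:
    """Decides when additional BGP paths may be installed after a
    fast-boot restart. Illustrative sketch, not the actual OS code."""

    def __init__(self, expected_peers):
        self.boot_time = time.monotonic()
        self.expected_peers = set(expected_peers)  # peers that must resync
        self.synced_peers = set()

    def peer_synced(self, peer):
        """Record that an established peer has synchronized with us."""
        self.synced_peers.add(peer)

    def may_install_additional_paths(self):
        """True once either release condition (or both) is met."""
        elapsed = time.monotonic() - self.boot_time
        return (elapsed >= FAST_BOOT_MULTIPATH_DELAY
                or self.synced_peers == self.expected_peers)
```

In practice whichever condition occurs first opens the gate, which matches the "combination of the previous two conditions" case in the list above.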
Delayed Installation of ECMP Routes Into BGP
The current FIB component of Dell EMC Networking OS has some inherent inefficiencies when handling a large number of ECMP routes
(that is, routes with multiple equal-cost next hops). To work around this in the fast-boot configuration, changes are made in BGP to delay
the installation of ECMP routes. This is done only if the system comes up through a fast-boot reload. The BGP route selection algorithm
selects only one best path to each destination and delays installation of additional ECMP paths until a minimum of 30 seconds has elapsed
from the time the first BGP peer is established. Once this time has elapsed, all routes in the BGP RIB are processed for additional paths.
While this change ensures that at least one path to each destination gets into the FIB as quickly as possible, it prevents
additional paths from being used even when they are available. This downside has been deemed acceptable.
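The installation order described above, best path first, remaining ECMP paths deferred until the delay expires, can be sketched as follows. The function and parameter names are illustrative assumptions, not the actual FIB implementation.

```python
def install_routes(rib, fib, fast_boot, ecmp_delay_elapsed):
    """Sketch of the fast-boot route-installation order.

    rib: mapping of destination prefix -> list of equal-cost next hops,
         ordered so rib[dest][0] is the single BGP best path.
    fib: mapping to populate (destination prefix -> installed next hops).
    """
    for dest, paths in rib.items():
        if fast_boot and not ecmp_delay_elapsed:
            # Fast-boot reload: install only the best path immediately.
            # The remaining ECMP paths are added once 30 seconds have
            # elapsed from the first established BGP peer.
            fib[dest] = paths[:1]
        else:
            # Normal operation (or delay expired): install all paths.
            fib[dest] = list(paths)
    return fib
```

A second pass with `ecmp_delay_elapsed=True` corresponds to the reprocessing of all RIB routes for additional paths once the timer fires.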
RDMA Over Converged Ethernet (RoCE) Overview
This functionality is supported on the platform.
RDMA is a technology that a virtual machine (VM) uses to directly transfer information to the memory of another VM, thus enabling VMs
to be connected to storage networks. With RoCE, RDMA enables data to be forwarded without passing through the CPU and the main
memory path of TCP/IP. In a deployment that contains both the RoCE network and the normal IP network on two different networks,
RRoCE combines the RoCE and the IP networks and sends the RoCE frames over the IP network. This method of transmission, called
RRoCE, results in the encapsulation of RoCE packets in IP packets. RRoCE sends InfiniBand (IB) packets over IP. IB supports input and
output connectivity for the internet infrastructure. InfiniBand enables the expansion of network topologies over large geographical
boundaries and the creation of next-generation I/O interconnect standards in servers.
When a storage area network (SAN) is connected over an IP network, the following conditions must be satisfied:
Faster connectivity: QoS for RRoCE enables faster and lossless disk input and output services.
Lossless connectivity: VMs require the connectivity to the storage network to be always lossless. When a planned upgrade of the
network nodes happens, especially with top-of-rack (ToR) nodes where there is a single point of failure for the VMs, disk I/O operations
are expected to resume within 20 seconds. If the disk is not accessible within 20 seconds, unexpected and undefined behavior of the VMs occurs.
You can optimize the booting time of the ToR nodes that experience a single point of failure to reduce the outage in traffic-handling
operations.
RRoCE is bursty and uses the entire 10-Gigabit Ethernet interface. Although RRoCE and normal data traffic are propagated in separate
network portions, it may be necessary in certain topologies to combine both the RRoCE and the data traffic in a single network structure.
RRoCE traffic is marked with dot1p priorities 3 and 4 (code points 011 and 100, respectively) and these queues are strict and lossless. DSCP
code points are not tagged for RRoCE. Both ECN and PFC are enabled for RRoCE traffic. For normal IP or data traffic that is not RRoCE-