User's Guide

Chapter XI. UDP Segmentation Offload and Pacing
Chelsio T5/T4 Unified Wire For Linux Page 173
1. Introduction
Chelsio’s T5/T4 series of adapters provides UDP segmentation offload and per-stream rate
shaping to drastically lower server CPU utilization, increase content delivery capacity, and
improve service quality.
Tailored for UDP content, UDP Segmentation Offload (USO) technology moves the processing
required to packetize UDP data and rate-control its transmission from software running on the
host to the network adapter. USO increases performance and dramatically reduces CPU
overhead, allowing significantly higher capacity on the same server hardware. Without USO
support, UDP server software running on the host must packetize the payload into frames,
process each frame individually through the network stack, and schedule the transmission of
each frame. This results in millions of system calls and packet traversals through all protocol
layers in the operating system to the network device. In contrast, USO implements the network
protocol stack in the adapter, and the host server software simply hands off unprocessed UDP
payload in large I/O buffers to the adapter.
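The per-frame work described above can be illustrated with a minimal sketch. It only models the
host-side packetization step and counts the resulting send operations; it does not use any
Chelsio API. The names packetize, MAX_UDP_PAYLOAD and io_buffer are hypothetical, and the
1472-byte limit assumes a 1500-byte MTU with a 20-byte IPv4 header and an 8-byte UDP header.

```python
# Illustrative sketch only: models the per-frame work a host does without USO.
# MAX_UDP_PAYLOAD assumes a 1500-byte MTU, a 20-byte IPv4 header,
# and an 8-byte UDP header (assumptions, not values from this guide).
MAX_UDP_PAYLOAD = 1500 - 20 - 8  # 1472 bytes


def packetize(payload: bytes, max_datagram: int = MAX_UDP_PAYLOAD) -> list[bytes]:
    """Split a large buffer into chunks that each fit in one UDP datagram."""
    if max_datagram <= 0:
        raise ValueError("max_datagram must be positive")
    return [payload[i:i + max_datagram]
            for i in range(0, len(payload), max_datagram)]


io_buffer = bytes(64 * 1024)          # hypothetical 64 KiB application buffer
frames = packetize(io_buffer)

sends_without_offload = len(frames)   # one send per frame through the host stack
handoffs_with_offload = 1             # one large buffer handed to the adapter

print(sends_without_offload, handoffs_with_offload)  # prints "45 1"
```

Under these assumptions, a single 64 KiB buffer costs 45 separate trips through the host stack
when packetized in software, versus one hand-off when the adapter performs the segmentation.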
The following figure compares the traditional datapath on the left to the USO datapath on the
right, showing how per-frame processing is eliminated. In this example, the video server pushes
5 frames at a time. In an actual implementation, a video server pushes 50 frames or more in
each I/O, drastically lowering the CPU cycles required to deliver the content.
Pacing is beneficial for several reasons. For example, Content Delivery Network (CDN) and
Video On Demand (VOD) providers use it to avoid receive buffer overflows, smooth out
network traffic, or enforce Service Level Agreements (SLAs). Without dedicated hardware-based
pacing support, the video server must perform pacing in software, which is a CPU-intensive
task and can be prohibitive at 10 Gb/s and higher rates.
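To give a rough sense of why software pacing becomes expensive at these rates, the following
sketch computes the spacing a software pacer would have to maintain between frames. It is a
simplified model under stated assumptions: it ignores Ethernet framing overhead, and the
function names inter_frame_gap and departure_times are hypothetical, not part of any
Chelsio or Linux API.

```python
def inter_frame_gap(frame_bytes: int, rate_bps: float) -> float:
    """Seconds between frame departures needed to hold a target bit rate."""
    if frame_bytes <= 0 or rate_bps <= 0:
        raise ValueError("frame_bytes and rate_bps must be positive")
    return frame_bytes * 8 / rate_bps


def departure_times(n_frames: int, frame_bytes: int, rate_bps: float,
                    start: float = 0.0) -> list[float]:
    """Evenly spaced departure times (in seconds) for n_frames at rate_bps."""
    gap = inter_frame_gap(frame_bytes, rate_bps)
    return [start + i * gap for i in range(n_frames)]


# Assumed example: 1500-byte frames at an aggregate 10 Gb/s.
gap_10g = inter_frame_gap(1500, 10e9)
print(f"{gap_10g * 1e6:.2f} us between frames")

# Assumed example: one 10 Mb/s stream of 1250-byte frames.
schedule = departure_times(3, 1250, 10e6)
```

With these assumed numbers, frames must leave roughly every 1.2 microseconds at 10 Gb/s, which
is more than 800,000 timed send decisions per second if the host has to schedule each one.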