Performance and Recommended Use of AD385A 10 Gigabit Ethernet SR Cards
From Results on an HP rx8640 Server

February 2008

Table of Contents
Introduction
Performance Summary
AD385A Throughput Performance
Scalability Tests
Recommended Use Based on Performance and Design
How We Measured 10 Gigabit Ethernet Efficiency
Features and Benefits of the AD385A
For More Information
Introduction

The availability of next-generation 10 Gigabit devices makes deployment of 10 Gigabit Ethernet even more attractive. 10 Gigabit Ethernet is ideal for data center deployment and can provide up to ten times the performance of 1 Gigabit Ethernet at approximately four times the cost. It can also free up I/O slots and enable server consolidation in deployments where I/O slot availability is a limiting factor.
Performance Summary

The AD385A card provides exceptional performance when used in accordance with the recommendations in this paper. Tests with Maximum Transmission Units (MTUs) of 9000 bytes and 1500 bytes show that the AD385A provides link-rate throughput for both transmit and receive tests when the recommended usage models are followed. During scalability tests, the AD385A exhibited excellent linear scaling in performance as additional adapters were added to the system.
AD385A Throughput Performance

Following are the highlights of the excellent performance achieved when operating at 9000 MTU:
• Receive traffic achieved a sustained throughput of 9.15 Gigabits/second, which is close to the link rate of a 10 Gigabit Ethernet link. The service demand (the CPU time consumed to handle one kilobyte of data) was 1.21µs/KByte.
• Transmit traffic achieved a throughput of over 9 Gigabits/second.
Following are the highlights of the impressive performance achieved during traffic testing at 1500 MTU (as shown in Figure 3):
• Receive traffic achieved 9.11 Gigabits/second throughput with a service demand of 2.20µs/KByte.
• Transmit traffic achieved a throughput of 9.47 Gigabits/second, which is the link rate at 1500 MTU for a 10 Gigabit Ethernet link; the service demand was 1.35µs/KByte.
• Bi-directional traffic achieved about 12 Gigabits/second with a service demand of 1.74µs/KByte.
Table 1 shows the service demand for each traffic type and MTU.

Table 1: 10 Gigabit Ethernet Service Demand (µs of CPU time consumed/KB)

                    1500 MTU    9000 MTU
  Transmit            1.35        0.86
  Receive             2.20        1.21
  Bi-directional      1.74        1.
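To make the service-demand numbers concrete, the following is a minimal sketch of how CPU utilization and throughput combine into a µs/KByte figure. The function name and the 8.45% utilization in the example are illustrative assumptions, not measurements from these tests; the 16 CPUs correspond to the rx8640's eight dual-core processors listed in Table 2.

    # Illustrative only: relates CPU utilization and throughput to service demand.
    # The utilization value below is assumed, not a measured figure from this paper.

    def service_demand_us_per_kb(throughput_gbps, cpu_utilization, num_cpus):
        """CPU microseconds consumed per kilobyte (1024 bytes) transferred."""
        kb_per_second = throughput_gbps * 1e9 / 8 / 1024      # data moved each second
        cpu_us_per_second = cpu_utilization * num_cpus * 1e6  # CPU time burned each second
        return cpu_us_per_second / kb_per_second

    # Example: at 9.15 Gb/s, 16 CPUs at an assumed 8.45% average utilization
    # yield roughly the 1.21 us/KB receive figure shown in Table 1.
    print(round(service_demand_us_per_kb(9.15, 0.0845, 16), 2))

Lower service demand at 9000 MTU reflects the per-packet costs being amortized over six times more data per frame.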
Scalability Tests

The AD385A exhibited linear scaling in performance as additional adapters were added to the system. During scalability tests, adapters were installed in slots 3 and 4 of both I/O bays of the rx8640. With four adapters installed on the server, the aggregate transmit throughput was 39.6 Gigabits/second when operating at 9000 MTU. The transmit performance with one, two, three, and four AD385A adapters is shown in Figure 4.
An aggregate throughput of 37 Gigabits/second was achieved for receive traffic with four adapters when operating at 9000 MTU. The receive performance with one, two, three, and four AD385A adapters is shown in Figure 5.

Figure 5: 10 Gigabit Ethernet Receive Scalability Tests (10GigE Receive Scaling on rx8640; throughput in 10^6 bits per second). The 1500 MTU series measured 9107.88, 18393.67, 27507.1, and about 35120 for one, two, three, and four adapters respectively.
The aggregate bi-directional throughput was about 43 Gigabits/second with four adapters when operating at 9000 MTU. The bi-directional throughput with one, two, three, and four AD385A adapters is shown in Figure 6.

Figure 6: 10 Gigabit Ethernet Bi-directional Scalability Tests (10GigE Bi-directional Scaling on rx8640; throughput in 10^6 bits per second). The 1500 MTU series measured 12026.17, 23295.23, 33965.02, and about 41707 for one, two, three, and four adapters respectively.
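As a quick check on the linearity claim, the following sketch computes per-adapter scaling efficiency from the 1500 MTU receive series in Figure 5; the data points are taken from the figure above, and the code itself is our own illustrative scaffolding.

    # Scaling efficiency for the 1500 MTU receive series plotted in Figure 5.
    # Values are in 10^6 bits per second; the four-adapter value is truncated
    # in the source, so only the first three points are used here.
    receive_1500mtu = [9107.88, 18393.67, 27507.1]  # 1, 2, 3 adapters

    single = receive_1500mtu[0]
    for n, aggregate in enumerate(receive_1500mtu, start=1):
        efficiency = aggregate / (n * single)
        print(f"{n} adapter(s): {aggregate / 1000:.2f} Gb/s, efficiency {efficiency:.0%}")

Efficiencies at or slightly above 100% are what the linear scaling described above looks like in these data.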
Request-Response tests were also run to demonstrate the adapter's ability to handle transaction-oriented traffic. Again, tests were run with one, two, three, and four AD385A adapters. The aggregate transaction rate that could be sustained with the processors running close to saturation was captured. These results are shown in Figure 7.
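This paper does not detail the request-response harness used, so the following is a minimal sketch of the general technique, assuming an echo-style responder on the far end; the host, port, duration, and message size are illustrative choices of ours.

    # Minimal request-response loop: counts round trips of a small message
    # over one TCP connection. Assumes the peer echoes each request back.
    # A production harness would loop on recv until the full response arrives.
    import socket
    import time

    def transactions_per_second(host, port=50007, duration=10.0, msg=b"x" * 64):
        sock = socket.create_connection((host, port))
        count = 0
        start = time.time()
        while time.time() - start < duration:
            sock.sendall(msg)     # small request
            sock.recv(len(msg))   # wait for the small response
            count += 1
        sock.close()
        return count / (time.time() - start)

Unlike the streaming tests, this workload is dominated by per-transaction latency and interrupt handling rather than raw bandwidth.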
Recommended Use Based on Performance and Design

HP recommends the following usage model to achieve the best performance:
• Run the AD385A cards in the highest-performing PCI-X slots. Slots 3, 4, 5, and 6 in each of the I/O bays are the recommended high-performance PCI-X slots in the HP Integrity rx8640 used in our performance testing. Refer to the appropriate Server Installation Guide for information on how to identify high-performance slots.
How We Measured 10 Gigabit Ethernet Efficiency

This article highlights the AD385A throughput. Throughput is the data transfer rate: the quantity of data transferred from one system to another in a given amount of time. In this article it is shown for both one-way and two-way transfers. Throughput indicates how much traffic a system can sustain under a given workload and, consequently, how quickly user requests can be handled.
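As an illustration of what a one-way throughput measurement involves, here is a minimal receive-side sketch. The port, buffer size, and test duration are arbitrary choices for illustration; this is not the harness used to produce the results in this paper.

    # Receive-side throughput measurement: accept one connection, count bytes
    # for a fixed interval, and report Gigabits/second.
    import socket
    import time

    def measure_receive_gbps(port=50007, duration=10.0, bufsize=256 * 1024):
        server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        server.bind(("", port))
        server.listen(1)
        conn, _ = server.accept()
        total_bytes = 0
        start = time.time()
        while time.time() - start < duration:
            data = conn.recv(bufsize)
            if not data:          # sender closed early
                break
            total_bytes += len(data)
        elapsed = time.time() - start
        conn.close()
        server.close()
        return total_bytes * 8 / elapsed / 1e9

A corresponding sender simply writes a large buffer in a loop for the same interval; bi-directional tests run both directions at once.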
Table 2: Products Used in the Performance Measurement Tests

Servers Tested
rx8640 Server, 2-cell configuration
• Eight 1.6 GHz Itanium 2 dual-core processors with 18 MB cache; four sockets per cell
• System memory: 32 GB; eight DIMM slots populated in each of the cells
• Operating system: HP-UX 11i v3 of Sep 2007 (B.11.31.0709)

Cards Tested
AD385A PCI-X 266MHz 10 Gigabit Ethernet card
• PCI-X (64-bit, 266MHz, 3.
Features and Benefits of the AD385A

Features and benefits of the AD385A include:
• PCI-X operation in 266MHz, 64-bit mode.
• Conforms to IEEE 10GBASE-SR using multi-mode fiber. Operating distances from 7 to 984 feet (2 to 300 meters).
• Supports several features that improve CPU utilization:
— Jumbo Frames with a maximum transmission unit (MTU) of 9000 bytes.
— On-board TCP Segmentation Offload (TSO) for IPv4.
— On-board Checksum Offload (CKO) for TCP, UDP, and IPv4.
For More Information

© 2008 Hewlett-Packard Company, L.P. The information contained herein is subject to change without notice. The only warranties for HP products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. HP shall not be liable for technical or editorial errors or omissions contained herein.