[Figure: Transmit Bandwidth on rx4640 (message size 64 KB, processor speed 1.5 GHz, 3 CPUs) - transmit-side bandwidth (MB/s) and transmit-side CPU utilization (%) plotted against the number of parallel streams.]
Non-Optimal Configurations
When plugged into single-rope PCI-X slots on an rx2600 (2 CPUs/1.5 GHz/4 GB RAM) in a point-to-point configuration, HCAs offer a one-way latency of 7.8 usec, as against 7.0 usec on dual-rope slots in a similar configuration. The HP-UX clustering interconnect solution saturates a single-rope PCI-X slot quickly and can offer only 452 MB/s of bandwidth for a 4 MB message, as against 760 MB/s on dual-rope slots in a similar configuration.
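As a point of reference, one-way latency and single-stream bandwidth figures of this kind are commonly obtained with a ping-pong microbenchmark: two processes exchange a message repeatedly, one-way latency is taken as half the average round-trip time, and bandwidth as message size divided by one-way time. The following sketch is a hypothetical illustration written against a generic MPI library (it is not the benchmark used for the numbers in this paper); the function name ping_pong, the iteration count, and the message sizes are assumptions for illustration only.

    /* Hypothetical ping-pong sketch; run with two MPI ranks. */
    #include <mpi.h>
    #include <stdio.h>
    #include <stdlib.h>

    #define ITERATIONS 1000

    static void ping_pong(size_t msg_size, int rank)
    {
        char *buf = malloc(msg_size);
        MPI_Status status;
        double start, elapsed;
        int i;

        MPI_Barrier(MPI_COMM_WORLD);
        start = MPI_Wtime();
        for (i = 0; i < ITERATIONS; i++) {
            if (rank == 0) {
                /* Rank 0 sends and waits for the echo. */
                MPI_Send(buf, (int)msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(buf, (int)msg_size, MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
            } else if (rank == 1) {
                /* Rank 1 echoes the message back. */
                MPI_Recv(buf, (int)msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                MPI_Send(buf, (int)msg_size, MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        elapsed = MPI_Wtime() - start;

        if (rank == 0) {
            /* One-way latency = half the average round trip. */
            double one_way_usec = (elapsed / ITERATIONS / 2.0) * 1e6;
            double bandwidth_mbs = (double)msg_size / (one_way_usec / 1e6) / 1e6;
            printf("%8lu bytes: %7.2f usec one-way, %8.2f MB/s\n",
                   (unsigned long)msg_size, one_way_usec, bandwidth_mbs);
        }
        free(buf);
    }

    int main(int argc, char **argv)
    {
        int rank;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        ping_pong(8, rank);               /* small message: latency-bound   */
        ping_pong(4 * 1024 * 1024, rank); /* 4 MB message: bandwidth-bound  */

        MPI_Finalize();
        return 0;
    }

Small messages expose the per-message latency of the path, while large (e.g. 4 MB) messages expose the sustainable bandwidth of the slot and link, which is why the single-rope limitation shows up most clearly at large message sizes.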
The HP Integrity rx4640 server has shared slots. Shared slots normally operate at 66 MHz but drop to 33 MHz if a 33 MHz card is plugged into the other shared slot on the server. The use of shared slots is therefore not recommended for performance-oriented HP-UX Fabric Clustering System interconnect configurations.
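A rough peak-bandwidth estimate shows why this matters. Assuming a 64-bit (8-byte-wide) PCI-X slot, which is an assumption about the slot width rather than a figure from this paper:

    peak bandwidth ~= bus width x clock ~= 8 bytes x 66 MHz ~= 528 MB/s at 66 MHz
    peak bandwidth ~= 8 bytes x 33 MHz ~= 264 MB/s at 33 MHz

A shared slot forced down to 33 MHz therefore cannot come close to the 760 MB/s the interconnect can deliver in a dual-rope slot, and even at 66 MHz the slot itself becomes the bottleneck.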
Summary
The HP-UX Fabric Clustering System interconnect solution offers a one-way latency of 4.6 usec and a single-stream bandwidth of 760 MB/s, all using industry-standard technologies and off-the-shelf components. The solution scales well across multiple HCAs as well as multiple connections, enabling an overall improvement for real-world applications. It supports both point-to-point and switch configurations without any negative impact on application performance.
References
Refer to the whitepapers on HP Fabric Clustering System available at www.hp.com.
www.hp.com/go/mpi
www.infinibandta.org