HP Fabric Clustering System for InfiniBand™ Interconnect Performance on HP-UX 11iv2

[Figure: Point-to-point Latency (Send/Receive programming model) on rx2600 (2 CPUs/1.5 GHz/4 GB RAM) and rx4640 (4 CPUs/1.5 GHz/2 GB RAM). X-axis: Message Length (bytes), 1 to 100000; Y-axis: Latency (usec), 0 to 55. Series: point-to-point latency on rx2600; point-to-point latency on rx4640.]
The latency values remain virtually unaffected when the measurements are taken over two independent pairs of HCAs plugged into rx4640 servers.
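For context, point-to-point latency in the Send/Receive model is typically obtained with a ping-pong test: one side sends a message, the peer echoes it back, and the one-way latency is reported as half of the averaged round-trip time. The sketch below illustrates only that timing structure; it is not the HP Fabric test harness. It uses ordinary TCP sockets as a stand-in for the InfiniBand send/receive path, and the port number, iteration count, and message sizes are illustrative assumptions.

/*
 * Illustrative ping-pong latency sketch (not the HP Fabric test harness):
 * TCP sockets stand in for the InfiniBand send/receive path, and one-way
 * latency is reported as half of the averaged round-trip time.
 * Usage:  pingpong server                  (on one node)
 *         pingpong client <server-host>    (on the other node)
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <netdb.h>

#define PORT       5555      /* illustrative port number */
#define ITERATIONS 1000      /* round trips averaged per message size */
#define MAX_MSG    100000    /* largest message length, as in the chart */

static char buf[MAX_MSG];

/* send() may transfer fewer bytes than requested; loop until done */
static void send_all(int sock, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = send(sock, p, len, 0);
        if (n <= 0) { perror("send"); exit(1); }
        p += n;
        len -= (size_t)n;
    }
}

static void pingpong(int sock, int initiator)
{
    size_t len;

    for (len = 1; len <= MAX_MSG; len *= 10) {
        struct timeval t0, t1;
        int i;

        gettimeofday(&t0, NULL);
        for (i = 0; i < ITERATIONS; i++) {
            if (initiator) {         /* send, then wait for the echo */
                send_all(sock, len);
                recv(sock, buf, len, MSG_WAITALL);
            } else {                 /* wait for a message, echo it back */
                recv(sock, buf, len, MSG_WAITALL);
                send_all(sock, len);
            }
        }
        gettimeofday(&t1, NULL);

        if (initiator) {
            double usec = (t1.tv_sec - t0.tv_sec) * 1e6 +
                          (t1.tv_usec - t0.tv_usec);
            /* one-way latency = half of the average round-trip time */
            printf("%7lu bytes: %.2f usec\n",
                   (unsigned long)len, usec / ITERATIONS / 2.0);
        }
    }
}

int main(int argc, char **argv)
{
    struct sockaddr_in addr;
    int one = 1, sock;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(PORT);

    if (argc == 2 && strcmp(argv[1], "server") == 0) {
        int lsock = socket(AF_INET, SOCK_STREAM, 0);
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        setsockopt(lsock, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);
        bind(lsock, (struct sockaddr *)&addr, sizeof addr);
        listen(lsock, 1);
        sock = accept(lsock, NULL, NULL);   /* wait for the initiator */
        setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
        pingpong(sock, 0);
    } else if (argc == 3 && strcmp(argv[1], "client") == 0) {
        struct hostent *h = gethostbyname(argv[2]);
        if (h == NULL) { fprintf(stderr, "unknown host\n"); return 1; }
        memcpy(&addr.sin_addr, h->h_addr_list[0], h->h_length);
        sock = socket(AF_INET, SOCK_STREAM, 0);
        connect(sock, (struct sockaddr *)&addr, sizeof addr);
        setsockopt(sock, IPPROTO_TCP, TCP_NODELAY, &one, sizeof one);
        pingpong(sock, 1);
    } else {
        fprintf(stderr, "usage: %s server | client <host>\n", argv[0]);
        return 1;
    }
    close(sock);
    return 0;
}

Half the round trip is reported, rather than the full round trip, so that the numbers are directly comparable to the one-way latency values in the charts in this section.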
As the number of parallel latency streams sharing the same HCA increases, latency degrades, and the degradation grows with message size. The standard deviation of the individual stream latencies from the mean latency depends on the number of processors on the server. The chart below illustrates the impact of three parallel streams on mean latency and standard deviation.
[Figure: Parallel Latency Streams: Mean Latency and Standard Deviation (rx4640/3 CPUs/1.5 GHz/2 GB RAM), Send/Receive programming model. X-axis: Message Length (bytes), 1 to 100000; left Y-axis: Latency (usec), 0 to 140; right Y-axis: Standard Deviation from mean latency, 0 to 0.16. Series: mean latency of 3 parallel latency streams; single stream latency; standard deviation from the mean latency.]
Once the number of parallel latency streams running on the same HCA of a server exceeds the number of available processors on that server, process scheduling issues adversely impact the mean latency and standard deviation values. Process scheduling issues likewise affect parallel latency streams running on different HCAs once the number of such streams exceeds the number of processors on the server.
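The mean latency and standard deviation plotted above are simple aggregates over the per-stream results for each message size. As a rough illustration only, the aggregation can be sketched as below; the latency values in the example are placeholders rather than measured data, a population standard deviation is assumed, and the exact normalization used for the standard-deviation axis in the chart is not specified here.

/* Aggregate per-stream latencies (usec) for one message size into a
 * mean and a standard deviation.  The three values in main() are
 * placeholders, not measured results.  Compile with: cc stats.c -lm */
#include <math.h>
#include <stdio.h>

static void aggregate(const double *lat, int nstreams)
{
    double sum = 0.0, var = 0.0, mean;
    int i;

    for (i = 0; i < nstreams; i++)
        sum += lat[i];
    mean = sum / nstreams;

    for (i = 0; i < nstreams; i++)
        var += (lat[i] - mean) * (lat[i] - mean);
    var /= nstreams;                    /* population variance */

    printf("%d streams: mean = %.2f usec, std dev = %.2f usec\n",
           nstreams, mean, sqrt(var));
}

int main(void)
{
    double lat[3] = { 20.1, 20.4, 20.9 };  /* hypothetical per-stream latencies */
    aggregate(lat, 3);
    return 0;
}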