White Papers

Dell Storage for HPC with Intel Enterprise Edition 2.3 for Lustre software
As part of the performance characterization of the solution, we explored the performance impact of
using the most recent LSI SAS drivers (version P8) available at the time of testing in comparison to
the native SAS drivers on RHEL/CentOS 6.6. We also experimented with different driver-level and
OS-level tunings. In addition, we experimented with caching states on the storage arrays and noted
the impact on overall performance of the solution. This paper presents our findings in the sections below.
4.1 N-to-N Sequential Reads / Writes
The sequential testing was done with the IOzone testing tool version 3.429. The throughput results
presented in Figure 11 are converted to MB/s. The file size selected for this testing was such that the
aggregate sample size from all threads was consistently 2TB. That is, sequential reads and writes had
an aggregate sample size of 2TB divided equally among the number of threads within that test. The
block size for IOzone was set to 1MB to match the 1MB Lustre request size.
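The sizing rule above can be sketched in a few lines of Python. This is a minimal illustration, not the script used for the paper. It assumes that "2TB" means 2 TiB and "1MB" means 1 MiB (the paper does not say which convention applies), that IOzone reported throughput in its default KB/s unit, and that the conversion to MB/s used a divisor of 1024. The function names are hypothetical; the thread counts are the ones shown on the Figure 11 axis.

```python
# Assumption: "2TB" is read as 2 TiB and "1MB" as 1 MiB.
AGGREGATE_MIB = 2 * 1024 * 1024  # 2 TiB expressed in MiB

# Thread counts taken from the x-axis of Figure 11.
THREAD_COUNTS = [1, 2, 4, 8, 12, 16, 24, 32, 48, 64, 72, 96, 120, 128, 256]


def per_thread_file_size_mib(threads, aggregate_mib=AGGREGATE_MIB):
    """Return the per-thread file size in MiB for a fixed aggregate size.

    The result is rounded down to a whole number of 1 MiB records, so the
    aggregate can fall slightly short of 2 TiB when the thread count does
    not divide it evenly (for example 12 or 24 threads).
    """
    if threads < 1:
        raise ValueError("threads must be at least 1")
    return aggregate_mib // threads


def kb_per_sec_to_mb_per_sec(kb_per_sec):
    """Convert a KB/s throughput figure to MB/s (assumed divisor: 1024)."""
    return kb_per_sec / 1024.0


if __name__ == "__main__":
    for threads in THREAD_COUNTS:
        size = per_thread_file_size_mib(threads)
        print(f"{threads:>3} threads -> {size} MiB per thread")
```

Under these assumptions a single thread writes the full 2 TiB, while each of 256 threads writes an 8 GiB file.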
Each file written was large enough to minimize cache effects from the OSS and the clients. The
other cache-avoidance techniques used in testing further reduced these effects. The files written were
distributed evenly across the OSTs (round robin). This was to prevent uneven I/O loads on any single
SAS connection or OST, in the same way that a user would expect to balance a workload.
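The round-robin placement described above can be illustrated as follows. This is a sketch of the idea only: the paper does not state how placement was enforced on the file system. The function name, the 48-file example and the choice of 24 OSTs are illustrative assumptions.

```python
from collections import Counter


def assign_round_robin(num_files, num_osts):
    """Map file index i to OST index i % num_osts (round robin)."""
    if num_osts < 1:
        raise ValueError("num_osts must be at least 1")
    return [i % num_osts for i in range(num_files)]


if __name__ == "__main__":
    # Hypothetical example: 48 files (one per thread) spread over 24 OSTs.
    placement = assign_round_robin(48, 24)
    files_per_ost = Counter(placement)
    print(dict(files_per_ost))
```

With this scheme the number of files per OST never differs by more than one, which is what keeps the load on each OST and SAS connection even.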
Figure 11: Sequential Reads / Writes Dell Storage for HPC with Intel EE for Lustre Solution
Figure 11 shows the sequential performance of the 960TB test configuration. With the test bed used,
write performance peaks at slightly less than 7GB/sec, while read performance peaks near 11GB/sec.
Tests were also performed to characterize the optimal ratio of OSTs to a single client workload.
Tests ran from a single client using a single thread against storage consisting of varying numbers of
OSTs (2, 4, 6, 8 and 24). While performance across the different client-to-OST ratios was relatively
close, we found that the ratio of a single client to 8 OSTs consistently yielded slightly higher
performance, with reads at 1005MB/sec and writes at 1090MB/sec. The write and
read performance rises steadily as we increase the number of process threads up to 24 where we see
[Figure 11 chart: IOzone Sequential I/O, Dell Storage for HPC with Intel EE for Lustre Solution. Y-axis: Throughput in MB/s (0 to 12000). X-axis: Number of concurrent threads (1, 2, 4, 8, 12, 16, 24, 32, 48, 64, 72, 96, 120, 128, 256). Series: Write, Read.]