
HPC Application Performance Study on 4S Servers
by Ranga Balimidi, Ashish K. Singh, and Ishan Singh
What can you do in HPC with a big, bad 4-socket machine with 60 cores and up to 6TB of memory? To help answer that
question, we conducted a performance study using several benchmarks and applications, including HPL, STREAM, WRF and
Fluent. This blog describes some of our results that help illustrate the possibilities. The server used for this
study is the Dell PowerEdge R920, which supports the Intel processor family code-named Ivy Bridge EX.
The server configuration table outlines the configuration used for this study, as well as the configuration from
a previous study performed in June 2010 on the previous generation of technology. We use these two systems to
compare performance across the technology refresh.
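As a rough reference point for that comparison, the theoretical double-precision peak of each system can be estimated from the socket count, core count and base clock listed in the server configuration table. The sketch below is illustrative only. The FLOPs-per-cycle values (8 for Ivy Bridge EX with AVX, 4 for the SSE-based X7550) are assumptions not stated in this study, the function name is hypothetical, and base clocks are used, so Turbo is ignored.

```python
def peak_gflops(sockets, cores_per_socket, base_ghz, dp_flops_per_cycle):
    """Theoretical double-precision peak in GFLOPS at base clock."""
    return sockets * cores_per_socket * base_ghz * dp_flops_per_cycle


# Socket, core and clock figures come from the server configuration table.
# The FLOPs-per-cycle values are assumptions (AVX: 8, SSE: 4 per core).
r920_peak = peak_gflops(sockets=4, cores_per_socket=15, base_ghz=2.3,
                        dp_flops_per_cycle=8)
r910_peak = peak_gflops(sockets=4, cores_per_socket=8, base_ghz=2.0,
                        dp_flops_per_cycle=4)

print(f"R920 theoretical peak: {r920_peak:.0f} GFLOPS")
print(f"R910 theoretical peak: {r910_peak:.0f} GFLOPS")
print(f"Ratio R920/R910: {r920_peak / r910_peak:.2f}x")
```

These are upper bounds derived from the listed specifications, not measured results. Measured HPL performance is always lower than the theoretical peak.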
Server Configuration

PowerEdge R920 Hardware
  Processors        4 x Intel Xeon E7-4870v2 @ 2.3GHz (15 cores), 30M cache, 130W
  Memory            512GB = 32 x 16GB 1333MHz RDIMMs

PowerEdge R910 Hardware
  Processors        4 x Intel Xeon X7550 @ 2.00GHz (8 cores), 18M cache, 130W
  Memory            128GB = 32 x 4GB 1066MHz RDIMMs

Software and Firmware for PowerEdge R920
  Operating System  Red Hat Enterprise Linux 6.5 (kernel version 2.6.32-431.el6.x86_64)
  Intel Compiler    Version 14.0.2
  Intel MKL         Version 11.1
  Intel MPI         Version 4.1
  BIOS              Version 1.1.0
  BIOS Settings     System Profile set to Performance (Logical Processor disabled, Node Interleave disabled)

Benchmarks & Applications for PowerEdge R920
  HPL               v2.1, from Intel MKL v11.1, problem size 90% of total memory.
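
The "90% of total memory" problem size for HPL can be turned into a matrix dimension N with a short calculation: HPL factors a dense N x N matrix of 8-byte doubles, so 8 * N^2 should stay within the chosen share of memory. The sketch below is one way to do that arithmetic, not the exact procedure used in this study. The function name, the block size NB = 192 and the reading of 512GB as 512 GiB are all assumptions.

```python
import math


def hpl_problem_size(mem_gib, fraction=0.9, nb=192):
    """Largest N (a multiple of nb) with 8 * N^2 <= fraction * memory.

    mem_gib  -- total system memory in GiB (assumed 2**30 bytes each)
    fraction -- share of memory given to the HPL matrix
    nb       -- HPL block size; 192 is a hypothetical value, not from the study
    """
    mem_bytes = mem_gib * 2**30
    # The matrix holds N * N double-precision elements of 8 bytes each.
    max_elements = int(fraction * mem_bytes) // 8
    n = math.isqrt(max_elements)
    # Round down so N divides evenly into NB-sized blocks.
    return n - (n % nb)


# 512GB is the R920 memory size from the configuration table.
n = hpl_problem_size(512)
print(f"Suggested HPL N for 512 GiB at 90%: {n}")
```

Rounding N down to a multiple of NB keeps the matrix evenly divisible into blocks. In practice, N is often reduced further to leave headroom for the operating system and MPI buffers.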
