REFERENCE ARCHITECTURES OF DELL EMC READY BUNDLE FOR HPC LIFE SCIENCES
Refresh with 14th Generation servers

ABSTRACT
Dell EMC’s flexible HPC architecture for Life Sciences has seen a dramatic improvement with the new Intel® Xeon® Scalable Processors. Equipped with 14th-generation servers, faster CPUs, and more memory, Dell EMC Ready Bundle for HPC Life Sciences delivers much higher throughput than the previous generation, especially in genomic data processing.
EXECUTIVE SUMMARY
Since Dell EMC announced the Dell EMC HPC Solution for Life Science in September 2016, the offering has continued to advance: in our benchmarking, the current Dell EMC Ready Bundle for HPC Life Sciences can process 485 genomes per day with 64x C6420s and Dell EMC Isilon F800. This is roughly a two-fold improvement over Dell EMC HPC System for Life Science v.1.
INTRODUCTION
Although the successful completion of the Human Genome Project was announced on April 14, 2003, after a 13-year endeavor and numerous exciting breakthroughs in technology and medicine, a great deal of work remains to understand and apply the human genome.
The solutions are nearly identical for Intel® OPA and IB EDR versions except for a few changes in the switching infrastructure and network adapter. The solution ships in a deep and wide 48U rack enclosure, which helps to make PDU mounting and cable management easier.
Management
o Dell EMC PowerEdge R440 for master node
   CPU: 2x Intel® Xeon® Gold 5118 @2.3GHz 12 cores (Skylake)
   Memory: 12x 8GB @2666 MHz
   Disk: 10x 1.8TB 10K RPM SAS 12Gbps 512e 2.5in Hot-plug Hard Drive in RAID 10
   BIOS System Profile: Performance Optimized
   Logical Processor: Enabled
   Virtualization Technology: Disabled
   OS: RHEL 7.4
o Dell EMC PowerEdge R640 for login node and CIFS gateway (optional)
   CPU: 2x Intel Xeon Gold 6132 @2.6GHz
assistance of the two fence devices; and ensure that the failed server does not return to service without the administrator’s knowledge or control.
Figure 2 NSS7.0-HA 480TB configuration; the recommended PDU is the APC switched rack PDU, model AP7921
Dell EMC PowerEdge R730
o CPU: 2x Intel Xeon E5-2660 v4 @2.0GHz
240 TB: One PowerVault MD3460 with 6 virtual disks
480 TB: One PowerVault MD3460 + One PowerVault MD3060e with 12 virtual disks
Dell EMC Ready Bundle for HPC Lustre Storage
The Dell EMC Ready Bundle for HPC Lustre Storage Solution, referred to as Dell HPC Lustre Storage, is designed for academic and industry users who need to deploy a fully supported, easy-to-use, high-throughput, scale-out, and cost-effective parallel file system storage solution.
o CPU: 2x Intel Xeon E5-2697 v4 @2.3GHz 18 cores
o Memory: 8x 16GB @2400 MHz
2x Dell EMC PowerEdge R730 for OSS nodes
o CPU: 2x Intel Xeon E5-2630 v4 @2.20GHz 10 cores
o Memory: 16x 16GB @2133 MHz
o OSS SAS controller: 4x SAS 12Gbps HBA LSI 9300-8e
o OSS storage array: 4x PowerVault MD3460 (Disks: 240x 3.5" 4TB 7.2K RPM NL SAS)
2x Dell EMC PowerEdge R730 for MDS nodes
o CPU: 2x Intel Xeon E5-2630 v4 @2.20GHz
Network Components
Dell Networking H1048-OPF
Intel® Omni-Path Architecture (OPA) is an evolution of the Intel® True Scale Fabric, Cray Aries interconnect, and internal Intel® IP [9]. In contrast to Intel® True Scale Fabric edge switches, which support 36 ports of InfiniBand QDR (40Gbps) performance, the new Intel® Omni-Path fabric edge switches support 48 ports of 100Gbps performance. The switching latency of True Scale edge switches is 165ns-175ns.
Reference Architecture Case 1
This reference architecture is configured with Intel® OPA fabric, Dell EMC Ready Bundle for HPC NFS Storage, and Dell EMC Ready Bundle for HPC Lustre Storage. Dell EMC Ready Bundle for HPC NFS Storage, referred to as the Dell NFS Storage Solution-High Availability configuration (NSS7.0-HA), is configured for general use such as home directories, while Dell EMC Ready Bundle for HPC Lustre Storage serves as high-performance scratch storage.
Figure 9 Management Network
Figure 10 shows the high-speed interconnect with Intel® OPA. The network topology is a 2:1 blocking fat tree, which requires three Dell Networking H1048-OPF 48-port switches; the port arithmetic behind that switch count is sketched below.
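The following minimal sketch shows how the three-switch count falls out of the 2:1 blocking ratio. The 64-node count and the even split of downlinks across leaf switches are illustrative assumptions, not a validated cabling plan.

```python
# Hedged sketch of the fat-tree port arithmetic, assuming 48-port edge switches
# and one fabric port per compute node.
SWITCH_PORTS = 48        # ports per H1048-OPF edge switch
BLOCKING = 2             # 2:1 oversubscription at the leaf level
compute_nodes = 64       # assumption: one OPA port per compute node

# At 2:1 blocking, each leaf devotes 2/3 of its ports to nodes and 1/3 to uplinks.
down_per_leaf = SWITCH_PORTS * BLOCKING // (BLOCKING + 1)   # 32 downlinks
up_per_leaf = SWITCH_PORTS - down_per_leaf                  # 16 uplinks

leaves = -(-compute_nodes // down_per_leaf)   # ceil(64 / 32) = 2 leaf switches
uplinks = leaves * up_per_leaf                # 32 uplinks toward the spine
spines = -(-uplinks // SWITCH_PORTS)          # 1 spine switch

print(f"{leaves} leaf + {spines} spine = {leaves + spines} switches")  # 2 + 1 = 3
```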
Reference Architecture Case 2
Dell EMC Isilon storage, the all-flash F800, is configured into Dell EMC Ready Bundle for HPC Life Sciences. In this reference architecture, the interconnect for the F800 has 8 ports of 40GbE connections. Since the two 40GbE external connections in each node of the F800 can be aggregated, the Z9100-ON switch is configured to support two-port aggregation for each node. Hence, as shown in Figure 12, two 40GbE ports are aggregated into one port channel that connects to a node in the F800.
Figure 12 Interconnection with 10GbE (1:1 Non-blocking)
The management network is identical to case 1 except for the connections to the Lustre storage (Figure 13).
Reference Architecture Case 3
This example shows Dell EMC Ready Bundle for HPC Life Sciences with Dell EMC Isilon H600 storage. The H600 is one of the Dell EMC Isilon hybrid scale-out NAS storage platforms; the storage is described in detail in the following sections.
Figure 14 Dell EMC Ready Bundle for HPC Life Sciences with 10GbE fabric and H600
The management network for this reference architecture is identical to the one shown in Figure 13.
Figure 15 Interconnection with 10GbE (1:1 Non-blocking) and IB QDR
Software Components
Along with the hardware components, the solution includes the following software components:
Bright Cluster Manager®
BioBuilds
Bright Cluster Manager
Bright Cluster Manager is commercial software from Bright Computing that provides comprehensive solutions for deploying and managing HPC clusters, big data clusters, and OpenStack in the data center and in the cloud (6).
PERFORMANCE EVALUATION AND ANALYSIS
GENOMICS/NGS DATA ANALYSIS PERFORMANCE
A typical variant calling pipeline consists of three major steps: 1) aligning sequence reads to a reference genome sequence; 2) identifying regions containing SNPs/InDels; and 3) performing preliminary downstream analysis. In the tested pipeline, BWA 0.7.2-r1039 is used for the alignment step, and the Genome Analysis Toolkit (GATK) is selected for the variant calling step.
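As a rough illustration of those three steps, the sketch below chains BWA-MEM, samtools, and GATK through Python's subprocess module. The file names (ref.fa, sample_R1.fastq.gz, and so on) are placeholders, the GATK4-style `gatk HaplotypeCaller` invocation is shown only for brevity, and none of the benchmarked pipeline's actual options, thread counts, or tool versions are reproduced here.

```python
import subprocess

# Placeholder inputs; substitute real paths for an actual run.
ref = "ref.fa"
r1, r2 = "sample_R1.fastq.gz", "sample_R2.fastq.gz"

# 1) Align reads to the reference genome with BWA-MEM and sort the alignments.
subprocess.run(
    f"bwa mem -t 16 {ref} {r1} {r2} | samtools sort -o sample.bam -",
    shell=True, check=True)
subprocess.run(["samtools", "index", "sample.bam"], check=True)

# 2) Identify SNPs/InDels with the GATK HaplotypeCaller (GATK4-style syntax).
subprocess.run(["gatk", "HaplotypeCaller",
                "-R", ref, "-I", "sample.bam", "-O", "sample.vcf.gz"],
               check=True)

# 3) Preliminary downstream analysis: count the called variant records.
subprocess.run("zcat sample.vcf.gz | grep -vc '^#'", shell=True, check=True)
```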
Figure 17 Performance on 13th- and 14th-generation servers with Isilon and Lustre
The number of compute nodes used for the tests is 64x C6420s and 63x C6320s (64x C6320s for testing the H600). The number of samples per node was increased to reach the desired total number of samples processed concurrently. For the C6320 (13G), 3 samples per node was the maximum number of samples each node could process.
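For context on how per-node concurrency translates into a "genomes per day" figure, a minimal throughput model follows. The samples-per-node and wall-clock values below are illustrative assumptions, not the measured benchmark results.

```python
# Back-of-the-envelope model relating node count, samples per node, and batch
# runtime to genomes processed per day.
def genomes_per_day(nodes: int, samples_per_node: int, runtime_hours: float) -> float:
    concurrent_samples = nodes * samples_per_node   # samples in flight per batch
    batches_per_day = 24.0 / runtime_hours          # batches completed per day
    return concurrent_samples * batches_per_day

# Example (hypothetical): 64 nodes, 4 samples per node, ~12.7 hours per batch
print(genomes_per_day(64, 4, 12.7))   # ~484 genomes/day, near the reported figure
```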
MOLECULAR DYNAMICS SIMULATION SOFTWARE PERFORMANCE
Over the past decade, GPUs have become popular in scientific computing because of their ability to exploit a high degree of parallelism. NVIDIA has a handful of life sciences applications optimized to run on its general-purpose GPUs. Unfortunately, these GPUs can only be programmed with CUDA, OpenACC, or the OpenCL framework.
Figure 19 AMBER JAC benchmark
Figure 20 AMBER STMV benchmark
Figure 19 and Figure 20 illustrate AMBER’s results with the DHFR and STMV datasets. On the SXM2 system (Config K), AMBER scales weakly with 2 and 4 GPUs. Even though the scaling is not strong, the V100 shows a noticeable improvement over the P100, giving a ~78% increase in single-card runs, and 1x V100 is actually 23% faster than 4x P100. On the PCIe (Config G) side, one and two cards perform similarly to SXM2; however, the four-card results drop sharply.
LAMMPS Large-scale Atomic/Molecular Massively Parallel Simulator (LAMMPS) is a classical molecular dynamics code and has potentials for solid-state materials (metals, semiconductors) and soft matter (biomolecules, polymers) and coarse-grained or mesoscopic systems. It can be used to model atoms or, more generically, as a parallel particle simulator at the atomic, meso, or continuum scale.
viruses and protein complexes at molecular resolution. Rapid vitrification at cryogenic temperature is the key step: it prevents water molecules from crystallizing and instead forms an amorphous solid that does almost no damage to the sample structure. Regular electron microscopy requires samples to be prepared in complex ways, and these preparations make it hard to retain the original molecular structures.
REFERENCES
1. Blueprint for High Performance Computing. Dell TechCenter. [Online] http://en.community.dell.com/techcenter/blueprints/blueprint_for_hpc/m/mediagallery/20443473.
2. ETL: The Silent Killer of Big Data Projects. insideBIGDATA. [Online] https://insidebigdata.com/2015/07/23/etl-the-silent-killer-of-big-data-projects/.
3. Dell EMC PowerEdge Servers. [Online] https://www.dellemc.com/en-us/servers/index.htm.
4. Dell EMC Ready Bundles for HPC Storage. [Online] https://si.cdn.dell.