DELL EMC HPC System for Life Sciences v1.
Revisions

Date             Description
February 2017    Initial release

THIS WHITE PAPER IS FOR INFORMATIONAL PURPOSES ONLY, AND MAY CONTAIN TYPOGRAPHICAL ERRORS AND TECHNICAL INACCURACIES. THE CONTENT IS PROVIDED AS IS, WITHOUT EXPRESS OR IMPLIED WARRANTIES OF ANY KIND.

Copyright © 2016-2017 Dell Inc. All rights reserved. Dell and the Dell EMC logo are trademarks of Dell Inc. in the United States and/or other jurisdictions.
Table of contents

Revisions
Executive summary
Audience
1 Introduction
2 System Design
3 Sample Architectures
4 Conclusion
Executive summary

In October 2015, Dell Technologies introduced the Genomic Data Analysis Platform (GDAP) v2.0 to address the growing need for rapid genomic analysis created by the availability of next-generation sequencing technologies. Following the successful implementation of GDAP v2.0, which is capable of processing up to 133 genomes per day while consuming 2 kilowatt-hours (kWh) per genome, we began to explore life sciences domains beyond genomics.
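To put those figures in perspective, the following is a minimal back-of-the-envelope sketch in Python; the 133 genomes/day and 2 kWh/genome values are the GDAP v2.0 numbers quoted above, and the assumption of uniform utilization over a 24-hour day is an illustrative one.

```python
# Back-of-the-envelope check of the GDAP v2.0 figures quoted above.
# Assumes uniform utilization over a 24-hour day (an illustrative assumption).
GENOMES_PER_DAY = 133
KWH_PER_GENOME = 2.0

daily_energy_kwh = GENOMES_PER_DAY * KWH_PER_GENOME   # ~266 kWh per day of operation
minutes_per_genome = 24 * 60 / GENOMES_PER_DAY        # ~10.8 minutes of cluster time per genome

print(f"Daily energy budget : {daily_energy_kwh:.0f} kWh")
print(f"Average cluster time: {minutes_per_genome:.1f} minutes per genome")
```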
1 Introduction

The Dell EMC HPC System for Life Sciences is a pre-integrated, tested, tuned, and purpose-built platform. It leverages the most relevant products from Dell EMC's high-performance computing line together with best-in-class partner products to address the high diversity of life sciences applications.
2 System Design

The first step in designing the system is to decide upon the following four basic considerations:

• Type of workload
  o Genomics/NGS data analysis only
  o General purpose and Genomics/NGS data analysis
  o Adding molecular dynamics simulation capacity
• Parameter for sizing
  o Number of compute nodes
  o Genomes per day to be analyzed (see the sizing sketch below)
• Form factor of servers
  o 2U shared infrastructure of high density that can host 4 compute nodes in one chassis (C6320)
  o 2U shared infrastructure of very high density (FC430)
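As a rough illustration of how the genomes-per-day sizing parameter can translate into a compute-node count, here is a minimal sketch; the per-node throughput and headroom values are illustrative assumptions, not measured figures for this solution.

```python
import math

def nodes_for_throughput(target_genomes_per_day: float,
                         genomes_per_node_per_day: float,
                         headroom: float = 1.2) -> int:
    """Estimate the compute-node count needed to reach a genomes-per-day target.

    genomes_per_node_per_day is an assumed per-node figure; headroom adds spare
    capacity for scheduling gaps and reruns.
    """
    return math.ceil(headroom * target_genomes_per_day / genomes_per_node_per_day)

# Hypothetical example: a 100 genomes/day target, assuming ~3 genomes/node/day.
print(nodes_for_throughput(100, 3))   # -> 40 nodes
```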
The master node controls the OS imaging and administration of the cluster. One master node is the default, and high availability of the master node is optional. The PowerEdge R430 is the recommended server for the master node.
• 8 x 32GB RDIMM, 2133 MT/s, Dual Rank
• 200GB Solid State Drive uSATA Mix Use Slim MLC 6Gbps 1.8in Hot-plug Drive
• Interconnect: Mellanox ConnectX-3 FDR Mezzanine adapter
• iDRAC8 Enterprise

PowerEdge C6320p - 1U, Half-width
• Intel Xeon Phi 7230, 64 cores, 1.3 GHz
• 96 GB at 2400 MT/s
• Internal 1.8" 240GB Solid State Drive, 6Gbps
PowerEdge C4130 - 1U - Up to 4 accelerators per node
• PowerEdge C4130, 2-socket server with Intel Xeon E5-2690 v4 processors
• 8 x 16GB RDIMM, 2400 MT/s, Dual Rank
  o 16 DIMM slots, DDR4 memory
  o 4GB/8GB/16GB/32GB DDR4 up to 2400 MT/s
• Up to 2 x 1.8" SATA SSD boot drives
• Optional 96-lane PCIe 3.0 switch
• Ports 10 and 11 are used for the PDUs.

Figure 1: Dell EMC Networking S3048-ON switch

Note: Installation of four SFP+ to RJ45 transceivers in Dell EMC Networking S3048-ON switch ports 49-52 is required.

2.2.2 High-Speed Interconnects
In high performance computing, application performance depends on the number of CPU/GPU cores, memory, interconnect, storage performance, and so on. For servers to perform well, the interconnect between them needs low latency and high bandwidth.
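To make the latency/bandwidth point concrete, the sketch below uses a simple first-order model in which transfer time is latency plus message size divided by bandwidth; the link numbers are nominal, illustrative values rather than measured results for this system.

```python
def transfer_time_us(message_bytes: int, latency_us: float, bandwidth_gbps: float) -> float:
    """First-order model: transfer time = latency + size / bandwidth."""
    bytes_per_us = bandwidth_gbps * 1e9 / 8 / 1e6   # link bandwidth in bytes per microsecond
    return latency_us + message_bytes / bytes_per_us

# Nominal, illustrative numbers: 10 GbE versus a 100 Gb/s fabric (e.g., Intel OPA or IB EDR).
for name, latency_us, bandwidth_gbps in [("10 GbE", 10.0, 10.0), ("100 Gb/s fabric", 1.0, 100.0)]:
    t = transfer_time_us(1 << 20, latency_us, bandwidth_gbps)
    print(f"{name}: {t:.1f} us for a 1 MiB message")
```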
• Up to 7 Tb/s aggregate switching capacity
• 19" rack-mountable chassis, 1U, with optional redundant power supplies and fan units
• On-board subnet manager (SM) for fabrics of up to 2k nodes

Dell EMC Networking S6000
• 1U high-density 10/40GbE ToR switch with 32 ports of 40GbE (QSFP+), or 96 ports of 10GbE and eight ports of 40GbE, or 104 ports of 10GbE
• Up to 2.56 Tbps of switching capacity
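As a small capacity-planning sketch tied to the port counts above (the 200-node cluster size is hypothetical):

```python
import math

def tor_switches_needed(node_count: int, access_ports_per_switch: int) -> int:
    """Number of top-of-rack switches needed to give every node one access port."""
    return math.ceil(node_count / access_ports_per_switch)

# Example: S6000 in its 96 x 10GbE + 8 x 40GbE mode, connecting a hypothetical 200-node cluster.
print(tor_switches_needed(200, 96))   # -> 3 switches (uplink ports handled separately)
```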
The HA cluster fails over the storage service to the healthy server with the assistance of the two fence devices, and it also ensures that the failed server does not return to service without the administrator's knowledge or control.

The test bed used to evaluate the NSS7.0-HA functionality and performance is shown in Figure 2. The following configuration was used:
• A 32-node HPC compute cluster (also known as "the clients") was used to provide I/O network traffic for the test bed.
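The failover-with-fencing behavior described above can be summarized in a short conceptual sketch; this is not the NSS7.0-HA implementation itself, and the object and method names below are hypothetical placeholders.

```python
# Conceptual sketch of storage failover with fencing -- not the NSS7.0-HA code.
# All names (power_off, import_shared_storage, start_nfs_service) are hypothetical.

def fail_over(failed_server, healthy_server, fence_devices):
    """Move the storage service to the healthy server, but only after the failed
    server has been fenced so it cannot silently return and corrupt shared storage."""
    fenced = any(device.power_off(failed_server) for device in fence_devices)
    if not fenced:
        # Never start the service on a second node while the first may still be alive.
        raise RuntimeError("fencing failed; storage service not relocated")

    healthy_server.import_shared_storage()
    healthy_server.start_nfs_service()
```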
Dell EMC Ready Bundle for HPC Lustre Storage
The Dell EMC Ready Bundle for HPC Lustre Storage, referred to as Dell EMC HPC Lustre Storage, is designed for academic and industry users who need to deploy a fully supported, easy-to-use, high-throughput, scale-out, and cost-effective parallel file system storage solution. The solution uses the Intel® Enterprise Edition (EE) for Lustre® software v3.0.
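As a hedged illustration of the scale-out aspect, the sketch below treats aggregate sequential throughput as growing linearly with the number of object storage server (OSS) building blocks; the per-block throughput is a placeholder value, not a benchmark result for this bundle.

```python
def aggregate_throughput_gb_per_s(oss_pairs: int, gb_per_s_per_pair: float) -> float:
    """First-order estimate: Lustre aggregate throughput scales with the number of OSS pairs."""
    return oss_pairs * gb_per_s_per_pair

# Placeholder per-pair throughput of 10 GB/s, purely for illustration.
for pairs in (1, 2, 4):
    print(f"{pairs} OSS pair(s): ~{aggregate_throughput_gb_per_s(pairs, 10.0):.0f} GB/s")
```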
Figure 4: Dell EMC Ready Bundle for HPC Lustre Storage Components Overview

2.4 Software Configuration
Along with the hardware components, the solution includes the following software components:
• Bright Cluster Manager®
• BioBuilds

2.4.1 Bright Cluster Manager
Bright Cluster Manager, from Bright Computing, is commercial software that provides comprehensive solutions for deploying and managing HPC clusters, big data clusters, and OpenStack in the data center and in the cloud.
• Using BioBuilds across all collaborators can help ensure reproducibility, since everyone is running the same version of the software. In short, it is a turnkey application package.
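As a hedged illustration of the reproducibility point (this helper is not part of BioBuilds; the tool list and file name are hypothetical), a small version manifest recorded next to the results lets collaborators confirm they ran the same software versions:

```python
import json
import subprocess

# Hypothetical subset of tools; BioBuilds ships many more packages than listed here.
TOOLS = {"samtools": ["samtools", "--version"], "bcftools": ["bcftools", "--version"]}

def record_versions(path: str = "tool_manifest.json") -> None:
    """Capture the first line of each tool's version banner for later comparison."""
    manifest = {}
    for name, cmd in TOOLS.items():
        try:
            out = subprocess.run(cmd, capture_output=True, text=True)
            banner = (out.stdout or out.stderr).strip()
            manifest[name] = banner.splitlines()[0] if banner else "unknown"
        except FileNotFoundError:
            manifest[name] = "not installed"
    with open(path, "w") as fh:
        json.dump(manifest, fh, indent=2)

record_versions()
```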
3 Sample Architectures

3.1 Case 1: PowerEdge C6320 compute subsystem with Intel® OPA fabric

Figure 5: Dell EMC HPC System for Life Sciences with PowerEdge C6320 rack servers and Intel® OPA fabric
3.1.1 Solution summary
This solution is nearly identical to the IB EDR and 10 GbE versions, except for a few changes in the switching infrastructure and network adapters. As shown in Figure 5, this solution uses one 48U rack and requires an extra-deep enclosure. Bright Cluster Manager, a proprietary software solution stack from Bright Computing, is the default management tool.
3.2 Case 2: PowerEdge FC430 compute subsystem with IB FDR fabric

Figure 6: Dell EMC HPC System for Life Sciences with PowerEdge FC430 rack servers and IB FDR fabric
3.2.1 Solution summary
The FC430 solution with IB FDR interconnect is nearly identical to the 10 GbE version, except for a few changes in the switching infrastructure and network adapters, and it has 2:1 blocking FDR connectivity to the top-of-rack FDR switch. The port assignment of the Dell EMC Networking S3048-ON switch for the Intel® OPA or IB versions of the solution is as follows.
4 Conclusion
The HPC System Builder for Life Sciences provides the minimum architecture that can achieve the targeted NGS workload, supporting informed decision making and increased efficiency. However, the configuration provided by the Dell EMC HPC System Builder for Life Sciences tool is intended to be used as a starting point only. Dell EMC suggests that you contact your technical sales representative to review the resulting quote for completeness and to include other variables that are not inputs to the tool.