Ready Solutions Engineering Test Results
Variant Calling (BWA-GATK) pipeline benchmark with Dell EMC Ready Bundle for HPC Life Sciences
13G/14G server performance comparisons with Dell EMC Isilon and Lustre Storage
Overview
Variant calling is the process of identifying variants from sequence data. It involves deciding whether single nucleotide
polymorphisms (SNPs)[i], insertions and deletions (indels)[ii], and/or structural variants (SVs)[iii] are present at a given position
in an individual genome or transcriptome. The main goal of identifying genomic variations is to link them to human diseases. Although
not all human diseases are associated with genetic variations, variant calling can provide valuable guidance for geneticists working
on a particular disease caused by genetic variations. BWA-GATK is one of the Next Generation Sequencing (NGS) computational tools
designed to identify germline and somatic mutations from human NGS data. There are a handful of variant identification tools, and we
understand that no single tool performs perfectly (Pabinger et al., 2014). However, we chose GATK, one of the most popular tools, as
our benchmarking tool to demonstrate how well the Dell EMC Ready Bundle for HPC Life Sciences can process complex and massive NGS
workloads.
The purpose of this blog is to provide valuable performance information on the Intel® Xeon® Gold 6148 processor and the previous
generation Xeon® E5-2697 v4 processor using a BWA-GATK pipeline benchmark on Dell EMC Isilon F800/H600 and Dell EMC Lustre
Storage. The Xeon® Gold 6148 CPU features 20 physical cores, or 40 logical cores when Hyper-Threading is enabled. This processor is
based on Intel’s new microarchitecture, codenamed “Skylake”. Intel significantly increased the L2 cache per core from 256 KB on
Broadwell to 1 MB on Skylake. The 6148 also has 27.5 MB of L3 cache and a six-channel DDR4 memory interface. The test cluster
configurations are summarized in Table 1.
Table 1 Test Cluster Configurations
                            Dell EMC PowerEdge C6420                  Dell EMC PowerEdge C6320
CPU                         2x Xeon® Gold 6148 20c 2.4GHz (Skylake)   2x Xeon® E5-2697 v4 18c 2.3GHz (Broadwell)
RAM                         12x 16GB @2666 MHz                        8x 16GB @2400 MHz
OS                          RHEL 7.3                                  RHEL 7.3
Interconnect                Intel® Omni-Path                          Intel® Omni-Path
BIOS System Profile         Performance Optimized                     Performance Optimized
Logical Processor           Disabled                                  Disabled
Virtualization Technology   Disabled                                  Disabled
BWA                         0.7.15-r1140                              0.7.15-r1140
Sambamba                    0.6.5                                     0.6.5
Samtools                    1.3.1                                     1.3.1
GATK                        3.6                                       3.6
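
The per-node totals implied by Table 1 can be worked out with a little arithmetic. The sketch below is only an illustration: the NodeConfig class and its field names are hypothetical helpers, not part of the benchmark, and the L2 figures are the per-core values quoted in the text (1 MB on Skylake, 256 KB on Broadwell). Because Logical Processor is disabled in the BIOS, the core counts are physical cores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeConfig:
    """Hypothetical container for one column of Table 1."""
    name: str
    sockets: int            # CPUs per node ("2x")
    cores_per_socket: int   # physical cores per CPU ("20c" / "18c")
    dimms: int              # number of DIMMs ("12x" / "8x")
    dimm_gb: int            # capacity of each DIMM in GB
    l2_kb_per_core: int     # per-core L2 cache quoted in the text

    @property
    def cores(self) -> int:
        # Physical cores per node (Logical Processor is disabled).
        return self.sockets * self.cores_per_socket

    @property
    def memory_gb(self) -> int:
        # Total installed memory per node.
        return self.dimms * self.dimm_gb

    @property
    def memory_gb_per_core(self) -> float:
        return self.memory_gb / self.cores

    @property
    def l2_total_mb(self) -> float:
        # Aggregate L2 cache across all cores of the node.
        return self.cores * self.l2_kb_per_core / 1024


# Values transcribed from Table 1 and the paragraph above it.
C6420 = NodeConfig("PowerEdge C6420 (Skylake)", 2, 20, 12, 16, 1024)
C6320 = NodeConfig("PowerEdge C6320 (Broadwell)", 2, 18, 8, 16, 256)

if __name__ == "__main__":
    for node in (C6420, C6320):
        print(
            f"{node.name}: {node.cores} cores, {node.memory_gb} GB RAM, "
            f"{node.memory_gb_per_core:.1f} GB/core, "
            f"{node.l2_total_mb:.0f} MB total L2"
        )
```

Under these assumptions the C6420 node has 40 cores and 192 GB of memory (about 4.8 GB per core), while the C6320 node has 36 cores and 128 GB (about 3.6 GB per core). Differences of this kind are worth keeping in mind when comparing per-node results between the two generations.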
