New NVIDIA V100 32GB GPUs, Initial Performance Results
Deepthi Cherlopalle, HPC and AI Innovation Lab. June 2018
GPUs are useful for accelerating large matrix operations, analytics, deep learning workloads, and several other use cases. NVIDIA introduced the Pascal line of its Tesla GPUs in 2016 and the Volta line in 2017, and recently announced its latest Tesla GPU based on the Volta architecture with 32GB of GPU memory. The V100 is available in both PCIe and NVLink versions, allowing GPU-to-GPU communication over PCIe or over NVLink; the NVLink version of the GPU is also called an SXM2 module.

This blog gives an introduction to the new Volta V100-32GB GPUs and compares HPL performance across the different V100 models. Tests were performed using a Dell EMC PowerEdge C4140 in both PCIe and SXM2 configurations. Several other platforms also support GPUs: PowerEdge R740, PowerEdge R740XD, PowerEdge R840, and PowerEdge R940xa. A similar study was conducted in the past comparing the performance of the P100 and V100 GPUs with the HPL, HPCG, AMBER, and LAMMPS applications.
Table 1 below provides an overview of Volta device specifications.

Table 1: GPU Specifications

                                  Tesla V100-PCIe    Tesla V100-SXM2
  GPU Architecture                Volta              Volta
  NVIDIA Tensor Cores             640                640
  NVIDIA CUDA Cores               5120               5120
  GPU Max Clock Rate              1380 MHz           1530 MHz
  Double Precision Performance    7 TFlops           7.8 TFlops
  Single Precision Performance    14 TFlops          15.7 TFlops
  GPU Memory                      16/32 GB           16/32 GB
  Interconnect Bandwidth          32 GB/s            300 GB/s
  System Interface                PCIe Gen3          NVIDIA NVLink
  Max Power Consumption           250 W              300 W
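The peak double-precision figures in Table 1 follow directly from the core counts and clock rates: the GV100 chip pairs its 5120 FP32 CUDA cores with half as many FP64 units, each retiring one fused multiply-add (2 FLOPs) per cycle. A minimal Python sketch of that arithmetic (the function name is illustrative, not from any NVIDIA tool):

```python
# Derive theoretical peak FP64 throughput for the V100 from Table 1.
# GV100 has 5120 FP32 CUDA cores and half as many (2560) FP64 units;
# each FP64 unit performs one FMA = 2 FLOPs per clock cycle.

def peak_tflops(fp64_units, clock_mhz, flops_per_cycle=2):
    """Peak TFlops = units x clock (Hz) x FLOPs per cycle, scaled to 1e12."""
    return fp64_units * clock_mhz * 1e6 * flops_per_cycle / 1e12

FP64_UNITS = 5120 // 2  # 2560 FP64 units on GV100

print(f"V100-PCIe: {peak_tflops(FP64_UNITS, 1380):.1f} TFlops")  # ~7.1
print(f"V100-SXM2: {peak_tflops(FP64_UNITS, 1530):.1f} TFlops")  # ~7.8
```

The result matches the quoted 7 and 7.8 TFlops, and the higher SXM2 number comes entirely from its higher boost clock; HPL efficiency is then measured against these theoretical peaks.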
