
2 Test Methodology
In this project, we ran image classification inference for the custom CheXNet model on the
PowerEdge R7425 server in different precision modes and software configurations: TensorFlow
with CPU support only, TensorFlow with GPU support, TensorFlow with TensorRT™, and
native TensorRT™. Using these different settings, we compared throughput and latency and
demonstrated the capability of the PowerEdge R7425 server when running inference with
Nvidia TensorRT™. See Figure 4.
Figure 4: Test Methodology for Inference
2.1 Test Design
The workflow pipeline spanned from training the custom model from scratch to running the
optimized inference graphs in multiple precision modes and configurations. To do so, we
followed the steps below:
a) Building the CheXNet model with TensorFlow, transfer learning, and the Estimator API (see the first sketch after this list)
b) Training the model for inference
c) Saving the trained model with TensorFlow Serving for inference
d) Freezing the saved model (see the second sketch after this list)
e) Running the inference with native TensorFlow, CPU only
f) Running the inference with native TensorFlow, GPU support (see the third sketch after this list)
g) Converting the custom model to run inference with TensorRT™
h) Running inference using the TensorFlow-TensorRT (TF-TRT) integration (see the fourth sketch after this list)
i) Running inference using the TensorRT™ C++ API
j) Comparing inferences in multi-mode configurations
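The first sketch illustrates step a) under stated assumptions: a DenseNet-121 backbone with a
14-class sigmoid head (one output per ChestX-ray14 pathology), built with tf.keras transfer
learning and wrapped as a TensorFlow Estimator. The model directory, learning rate, and layer
names are illustrative placeholders, not the exact values used in our tests.

    import tensorflow as tf

    def build_chexnet(num_classes=14):
        # DenseNet-121 backbone pre-trained on ImageNet (transfer learning)
        base = tf.keras.applications.DenseNet121(
            include_top=False, weights="imagenet",
            input_shape=(224, 224, 3), pooling="avg")
        # Multi-label sigmoid head for the 14 chest X-ray pathologies
        outputs = tf.keras.layers.Dense(num_classes, activation="sigmoid")(base.output)
        model = tf.keras.Model(inputs=base.input, outputs=outputs)
        model.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
                      loss="binary_crossentropy")
        return model

    # Wrap the Keras model as an Estimator for training and SavedModel export
    estimator = tf.keras.estimator.model_to_estimator(
        keras_model=build_chexnet(), model_dir="/tmp/chexnet")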
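The second sketch shows one way to perform step d) with the TensorFlow 1.x API: loading the
SavedModel exported for TensorFlow Serving and converting its variables to constants so that a
single frozen GraphDef can be fed to the inference runners. The export directory and output
node name are placeholders.

    import tensorflow as tf
    from tensorflow.python.saved_model import tag_constants

    export_dir = "/models/chexnet/1"      # placeholder SavedModel path
    output_nodes = ["dense/Sigmoid"]      # placeholder output tensor name

    with tf.Session(graph=tf.Graph()) as sess:
        # Load the SavedModel exported for TensorFlow Serving
        tf.saved_model.loader.load(sess, [tag_constants.SERVING], export_dir)
        # Replace variables with constants so the graph is self-contained
        frozen_graph = tf.graph_util.convert_variables_to_constants(
            sess, sess.graph_def, output_nodes)

    with tf.gfile.GFile("chexnet_frozen.pb", "wb") as f:
        f.write(frozen_graph.SerializeToString())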
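The third sketch covers steps e) and f), which run the same frozen graph with native TensorFlow;
the only difference is whether the CPU-only or GPU-enabled TensorFlow build is installed (or
whether GPUs are hidden via CUDA_VISIBLE_DEVICES). The tensor names and the 224x224x3 input
shape below are assumptions based on the DenseNet-121 backbone.

    import numpy as np
    import tensorflow as tf

    with tf.gfile.GFile("chexnet_frozen.pb", "rb") as f:
        graph_def = tf.GraphDef()
        graph_def.ParseFromString(f.read())

    with tf.Graph().as_default() as graph:
        tf.import_graph_def(graph_def, name="")
        inp = graph.get_tensor_by_name("input_1:0")        # placeholder input name
        out = graph.get_tensor_by_name("dense/Sigmoid:0")  # placeholder output name

    with tf.Session(graph=graph) as sess:
        batch = np.random.rand(8, 224, 224, 3).astype(np.float32)  # dummy batch
        predictions = sess.run(out, feed_dict={inp: batch})
        print(predictions.shape)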
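The fourth sketch relates to steps g) and h), where the frozen graph is converted with the TF-TRT
integration so that supported subgraphs execute as TensorRT™ engines. It assumes the TensorFlow
1.x contrib API; the output node, batch size, workspace size, and FP16 precision mode are
illustrative, and the FP32 and INT8 runs follow the same pattern.

    import tensorflow as tf
    from tensorflow.contrib import tensorrt as trt

    with tf.gfile.GFile("chexnet_frozen.pb", "rb") as f:
        frozen_graph = tf.GraphDef()
        frozen_graph.ParseFromString(f.read())

    # Replace TensorRT-compatible subgraphs with TRT engine ops
    trt_graph = trt.create_inference_graph(
        input_graph_def=frozen_graph,
        outputs=["dense/Sigmoid"],        # placeholder output node
        max_batch_size=8,
        max_workspace_size_bytes=1 << 30,
        precision_mode="FP16")

    with tf.gfile.GFile("chexnet_tftrt_fp16.pb", "wb") as f:
        f.write(trt_graph.SerializeToString())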