White Papers

Retail Analytics with Malong RetailAI® on Dell EMC PowerEdge Servers
Overview of Deep Learning
Deep learning consists of two phases: training and inference. As illustrated in Figure 1, training involves
learning a neural network model from a given training dataset over a number of training iterations,
guided by a loss function [1]. The output of this phase, the learned model, is then used in the
inference phase to make predictions on new data.
The major difference between training and inference is that training employs both forward propagation and
backward propagation (the two passes of the deep learning process), whereas inference consists mostly of
forward propagation [2]. To produce models with good accuracy, the training phase requires many
training iterations over substantial amounts of training data, and therefore many-core CPUs or GPUs to
accelerate performance.
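The distinction between the two phases can be sketched with a toy model. The example below is a minimal illustration, assuming a made-up linear model `y = w*x + b` trained by gradient descent on a mean-squared-error loss; it is not the paper's actual network, but it shows where forward propagation, the loss function, backward propagation, and forward-only inference each occur.

```python
# Toy illustration of the two deep learning phases (assumed
# example, not the paper's model): training uses forward AND
# backward propagation; inference uses forward only.
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=100)
y_true = 3.0 * x + 1.0                        # target function to learn

w, b, lr = 0.0, 0.0, 0.1
for step in range(200):                       # training iterations
    y_pred = w * x + b                        # forward propagation
    loss = np.mean((y_pred - y_true) ** 2)    # loss function
    grad_w = np.mean(2 * (y_pred - y_true) * x)   # backward propagation
    grad_b = np.mean(2 * (y_pred - y_true))       # (gradients of the loss)
    w -= lr * grad_w                          # parameter update
    b -= lr * grad_b

# Inference phase: forward propagation only, on new data.
x_new = np.array([0.5])
prediction = w * x_new + b
```

In a real deep network the backward pass is computed by automatic differentiation rather than by hand, but the structure of the loop is the same, which is why training is so much more compute-intensive than inference.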
Figure 1. Deep Learning phases.
Deep Learning Inferencing
After a model is trained, the generated model may be deployed (forward propagation only), e.g., on
FPGAs, CPUs, or GPUs, to perform a specific business-logic function or task such as identification,
classification, recognition, and segmentation (Figure 2).
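For classification, the deployed forward pass typically ends in a softmax over class scores followed by an argmax. The sketch below assumes a trained model reduced to a single dense layer for brevity; the weights and class labels are made up for illustration.

```python
# Hedged sketch: forward-only inference for classification.
# W, b, and the label set are hypothetical stand-ins for a
# trained model's final layer, not real RetailAI parameters.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())       # subtract max for numerical stability
    return e / e.sum()

# Pretend these parameters came out of the training phase.
W = np.array([[0.2, -0.1], [0.5, 0.3], [-0.4, 0.8]])  # 3 classes x 2 features
b = np.array([0.0, 0.1, -0.1])
labels = ["shirt", "shoe", "bag"]                     # hypothetical classes

features = np.array([0.9, 0.4])      # features extracted from a new image
logits = W @ features + b            # forward propagation only
probs = softmax(logits)              # class probabilities
predicted = labels[int(np.argmax(probs))]
```

A production deployment replaces the single layer with the full trained network, but the inference path is still just this forward computation, which is what makes it amenable to acceleration on FPGAs, CPUs, or GPUs.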
The focus of this paper is the performance of the Dell EMC PowerEdge R7425 with NVIDIA T4-16GB GPUs in
accelerating image classification, delivering high inference throughput and low latency using
various implementations of NVIDIA TensorRT™, a toolkit for optimizing and accelerating inference, together
with the Malong RetailAI® software stack.
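Throughput and latency, the two metrics emphasized here, are commonly measured by timing repeated inference calls and reporting tail latency alongside images per second. The sketch below uses a placeholder `run_inference` function standing in for a real TensorRT execution call (it is not a TensorRT API), so only the measurement pattern is shown.

```python
# Hedged sketch of inference benchmarking: run_inference is a
# hypothetical stand-in for a real engine's forward pass.
import time
import statistics

def run_inference(batch):
    # Placeholder workload standing in for a forward pass.
    return [x * 2 for x in batch]

batch, n_runs = list(range(32)), 50    # batch size 32, 50 timed runs
latencies = []
for _ in range(n_runs):
    t0 = time.perf_counter()
    run_inference(batch)
    latencies.append(time.perf_counter() - t0)

p99 = statistics.quantiles(latencies, n=100)[98]   # 99th-percentile latency (s)
throughput = len(batch) * n_runs / sum(latencies)  # images per second
```

Real benchmarks additionally include warm-up runs and device synchronization before timing; the trade-off the paper explores is that larger batches raise throughput while smaller batches keep tail latency low.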