White Papers

Dell - Internal Use - Confidential
Figure 3: The training speed and time of GoogleNet in MXNet using P100 GPUs
Figure 4: The training speed and time of Inception-BN in MXNet using P100 GPUs
Figure 5 shows the training speed and training time of the Inception-V3 neural network in TensorFlow
using P100 GPUs. As with NV-Caffe and MXNet, TensorFlow's training speed scaled well as more P100 GPUs
were used. Multi-node training with TensorFlow ran, but its performance was poor, so those results are
not shown here; the cause is still under investigation.
Figure 5: The training speed and time of Inception-V3 in TensorFlow using P100 GPUs
[Chart data for Figure 4: MXNet Inception-BN, ILSVRC12. Series: Training Speed in images/sec (higher is better) and Training Time in seconds (lower is better).]
GPUs       Training Speed (images/sec)   Training Time (seconds)
1 P100     189                           6776
2 P100     373                           3433
4 P100     743                           1726
8 P100     1513                          847
12 P100    2270                          566
16 P100    2771                          463
[Chart data for Figure 5: TensorFlow Inception-V3, ILSVRC12. Series: Training Speed in images/sec (higher is better) and Training Time in seconds (lower is better).]
GPUs       Training Speed (images/sec)   Training Time (seconds)
1 P100     75                            3592
2 P100     132                           2059
4 P100     220                           1279
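
The scalability discussed for Figures 4 and 5 can be put into numbers by computing speedup and scaling efficiency from the images/sec data labels in the two charts. The following is a minimal sketch, not part of the original benchmark tooling: the dictionary names and the `scaling_table` helper are hypothetical, and efficiency is assumed here to mean speedup divided by the increase in GPU count relative to the single-GPU run.

```python
# Images/sec read from the data labels of the Figure 4 and Figure 5 charts
# (GPU count -> training speed). The variable names are illustrative only.
MXNET_INCEPTION_BN = {1: 189, 2: 373, 4: 743, 8: 1513, 12: 2270, 16: 2771}
TF_INCEPTION_V3 = {1: 75, 2: 132, 4: 220}


def scaling_table(speeds):
    """Return (gpus, speedup, efficiency) rows relative to the smallest GPU count.

    speedup    = speed at N GPUs / speed at the baseline GPU count
    efficiency = speedup / (N / baseline GPU count); 1.0 means linear scaling
    """
    base_gpus = min(speeds)
    base_speed = speeds[base_gpus]
    rows = []
    for gpus in sorted(speeds):
        speedup = speeds[gpus] / base_speed
        efficiency = speedup / (gpus / base_gpus)
        rows.append((gpus, speedup, efficiency))
    return rows


if __name__ == "__main__":
    for name, speeds in [
        ("MXNet Inception-BN", MXNET_INCEPTION_BN),
        ("TensorFlow Inception-V3", TF_INCEPTION_V3),
    ]:
        print(name)
        for gpus, speedup, eff in scaling_table(speeds):
            print(f"  {gpus:>2} P100: {speedup:5.2f}x speedup, {eff:6.1%} efficiency")
```

Under this definition, the chart values give MXNet Inception-BN a speedup of roughly 14.7x on 16 P100 GPUs (about 92% efficiency) and TensorFlow Inception-V3 a speedup of roughly 2.9x on 4 P100 GPUs (about 73% efficiency).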