Reference Guide

Dell EMC Ready Solutions for AI Deep Learning with NVIDIA | v1.0
The third switch in the solution, labeled the gateway switch in Figure 2, connects the Isilon F800 to the head
and compute nodes. The Isilon's external interfaces are 40 Gigabit Ethernet, so a switch that can serve
as the gateway between the 40GbE and InfiniBand networks is needed to reach the head
and compute nodes. The Mellanox SX6036 is used for this purpose. The gateway is connected to the InfiniBand
EDR switch and the Isilon as shown in Figure 2.
2.7 Software
The software portion of the solution is provided by Dell EMC and Bright Computing. The software includes
several pieces.
The first piece is Bright Cluster Manager, which is used to deploy and manage the clustered infrastructure.
It provides all cluster software, including the operating system, GPU drivers and libraries, InfiniBand drivers
and libraries, MPI middleware, and the Slurm scheduler.
The second piece is Bright Machine Learning (ML), which adds the deep learning library dependencies
to the base operating system; deep learning frameworks including Caffe/Caffe2, PyTorch, Torch7, Theano,
TensorFlow, Horovod, Keras, DIGITS, CNTK and MXNet; and deep learning libraries including cuDNN, NCCL,
and the CUDA toolkit.
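As a rough way to see which of the Python-facing frameworks listed above are visible in a given environment, the following sketch probes for them without importing them. The import names are assumptions based on each project's usual Python package name, not something this guide specifies. Torch7 (Lua-based) and DIGITS (a web application) are left out because they are not used as Python imports in the same way. On a cluster that uses environment modules, the relevant modules may need to be loaded before the check reports anything.

```python
from importlib.util import find_spec

# Assumed import names for the frameworks named in this section.
# These are the conventional package names, not taken from this guide.
FRAMEWORK_MODULES = {
    "Caffe": "caffe",
    "Caffe2": "caffe2",
    "PyTorch": "torch",
    "Theano": "theano",
    "TensorFlow": "tensorflow",
    "Horovod": "horovod",
    "Keras": "keras",
    "CNTK": "cntk",
    "MXNet": "mxnet",
}


def probe_modules(modules):
    """Return {label: bool} saying whether each top-level module can be found.

    find_spec only locates the module; it does not import it, so the
    check stays cheap even for large frameworks.
    """
    found = {}
    for label, module_name in modules.items():
        try:
            found[label] = find_spec(module_name) is not None
        except (ImportError, ValueError):
            # Treat a broken or half-installed package as unavailable.
            found[label] = False
    return found


if __name__ == "__main__":
    for label, ok in probe_modules(FRAMEWORK_MODULES).items():
        print(f"{label:<12}{'available' if ok else 'not found'}")
```

A module that is found can still fail at import time, for example if a GPU library is missing, so this is only a first-pass check.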
The third piece is the Data Scientist Portal, which was developed by Dell EMC. The portal abstracts
the complexity of the deep learning ecosystem behind a single interface that users can use to
get started with their models. The portal includes a spawner for JupyterHub and integrates with:
• Resource managers and schedulers (Slurm)
• LDAP for user management
• Deep learning framework environments (TensorFlow, Keras, MXNet, PyTorch, etc.)
• Module environment, Python2, Python3 and R kernel support
• TensorBoard
• Terminal CLI environments
It also provides starter templates for various DL environments and adds support for Singularity
containers. For more details about how to use the Data Scientist Portal, refer to Section 5.
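For orientation, the sketch below shows what a GPU job handed to the Slurm scheduler mentioned in this section might look like when written by hand. It only renders the batch script text and does not submit anything. The function name `render_sbatch`, the job name, the GPU count, and the training command are all hypothetical placeholders; `--gres=gpu:N` requests N GPUs per node and assumes the cluster has GPU generic resources configured in Slurm.

```python
def render_sbatch(job_name, nodes, gpus_per_node, command, time_limit="01:00:00"):
    """Render a minimal Slurm batch script as a string.

    All values are illustrative; partition, account, and module setup
    are site-specific and intentionally left out.
    """
    if nodes < 1:
        raise ValueError("nodes must be at least 1")
    if gpus_per_node < 0:
        raise ValueError("gpus_per_node must not be negative")

    lines = [
        "#!/bin/bash",
        f"#SBATCH --job-name={job_name}",
        f"#SBATCH --nodes={nodes}",
        f"#SBATCH --gres=gpu:{gpus_per_node}",  # GPUs requested per node
        f"#SBATCH --time={time_limit}",
        "",
        f"srun {command}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical two-node job with four GPUs per node.
    print(render_sbatch("train-demo", 2, 4, "python train.py"), end="")
```

The rendered text can be saved to a file and passed to `sbatch`. Generating the script from a function keeps the resource request in one place if the same job is launched with different node or GPU counts.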