White Papers

Dell HPC NFS Storage Solution High Availability Configurations with Large Capacities
1. Introduction
This Solution Guide provides information on the latest Dell NFS Storage Solution high availability
configurations (NSS-HA). The NSS-HA uses the NFS file system along with the Red Hat Scalable File
System (XFS) and Dell PowerVault storage to provide an easy-to-manage, reliable, and cost-effective
storage solution for HPC clusters. With this latest offering, NSS-HA configurations are now available in
capacities greater than 100 terabytes.
The philosophy and design principles for this release remain the same as previous Dell NSS-HA
configurations. Hence this version of the solution guide primarily describes the deltas in configuration
and performance. For complete details, review this document along with the previous version, titled
“Dell HPC NFS Storage Solution High Availability Configurations, Version 1.1”. The main change in this
version of the solution is support for larger capacities. The associated performance
characterization and updated software versions are also presented.
The following sections describe the technical details, evaluation method and the expected
performance of the solution. An extensive appendix provides a complete set of instructions on the
configuration steps and tuning parameters required to deploy such a solution.
2. NSS-HA solution review
The design of this version of the NSS-HA solution is similar to previous versions. This section provides a
quick review of the NSS-HA solution. Complete details are available in the document “Dell HPC NFS
Storage Solution High Availability Configurations, Version 1.1”. Readers who are already familiar
with the NSS-HA architecture can skip this section.
Figure 1 depicts the general overview of the NSS-HA solution. The core of the solution is a high
availability (HA) cluster, which provides a highly reliable and available storage service to the HPC
compute cluster via a high performance network connection such as InfiniBand (IB) or 10 Gigabit
Ethernet (10GbE). The HA cluster has shared access to disk-based Dell PowerVault storage in a variety
of capacities.
The HA cluster consists of several components as listed below:
High Availability nodes - These are servers configured with the Red Hat Enterprise Linux high
availability cluster software stack. In the NSS-HA solution, two systems are deployed as a pair
of NFS servers; they are configured in active/passive mode and have direct access to the
shared storage stack.
Network switch for the HA cluster (or the private network) - The private network is used for
communication between the HA cluster nodes and other cluster hardware, such as network
power switches and the fence devices that are installed in the cluster nodes.
Fence devices - Fence devices are required for fencing (rebooting) a failed or misbehaving
cluster node in the HA cluster. In the NSS-HA solution, two types of fence devices are
configured: switched Power Distribution Units (PDUs) and the Dell server management
controller, the iDRAC.
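
The interaction between the active/passive pair and the fence devices described above can be illustrated with a small model. The following Python sketch is illustrative only and is not the Red Hat cluster software: the class names, the fail_over function, and the order in which fence devices are tried are all assumptions made for this example. It models one common reason for fencing in shared-storage clusters, namely that the standby server should take over the NFS service only after the failed server is confirmed to be fenced, so that both servers never use the shared storage at the same time.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Node:
    """Hypothetical model of one NFS server in the HA pair."""
    name: str
    powered_on: bool = True
    serving_nfs: bool = False


@dataclass
class FenceDevice:
    """Hypothetical model of a fence device (for example an iDRAC or a PDU)."""
    kind: str
    reachable: bool = True

    def fence(self, node: Node) -> bool:
        """Try to fence the node; return True only if fencing succeeded."""
        if not self.reachable:
            return False
        # Simplification: fencing is modeled as a power-off rather than a reboot.
        node.powered_on = False
        node.serving_nfs = False
        return True


def fail_over(failed: Node, standby: Node, devices: List[FenceDevice]) -> bool:
    """Start the service on the standby only after the failed node is fenced.

    Devices are tried in list order (an assumption made for this sketch).
    """
    for device in devices:
        if device.fence(failed):
            standby.serving_nfs = True
            return True
    # No fence device worked: leave the service stopped instead of risking
    # both nodes accessing the shared storage at once.
    return False


if __name__ == "__main__":
    active = Node("nfs-server-1", serving_nfs=True)
    passive = Node("nfs-server-2")
    fence_devices = [FenceDevice("iDRAC", reachable=False), FenceDevice("PDU")]
    print("failover succeeded:", fail_over(active, passive, fence_devices))
    print("service now on:", passive.name if passive.serving_nfs else "none")
```

In this model, having two types of fence devices means that an unreachable iDRAC does not block failover, because the PDU can still fence the node.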