Dell HPC NFS Storage Solution – High Availability (NSS6.0-HA) Configuration with Dell PowerEdge 13th Generation Servers
A Dell Technical White Paper
Xin Chen
Dell HPC Engineering
November 2014 | Version 1.
This document is for informational purposes only and may contain typographical errors and technical inaccuracies. The content is provided as is, without express or implied warranties of any kind. © 2014 Dell Inc. All rights reserved. Dell and its affiliates cannot be responsible for errors or omissions in typography or photography.
Contents
Executive summary
1. Introduction
2. Overview of NSS-HA solutions
2.1.
Figure 4. IPoIB random write and read performance
Executive summary
This white paper describes the Dell NFS Storage Solution - High Availability configurations (NSS6.0-HA) with Dell PowerEdge 13th generation servers. It compares all NSS-HA offerings available so far and provides performance results for a configuration with a storage system providing 480TB of raw capacity.
1. Introduction
This white paper provides information on the latest Dell NFS Storage Solution - High Availability configurations with Dell PowerEdge 13th generation servers. The solution uses Dell PowerEdge servers and PowerVault storage arrays along with the Red Hat High Availability software stack to provide an easy-to-manage, reliable, and cost-effective storage solution for HPC clusters.
fail over the storage service to the healthy server with the assistance of the two fence devices, and also ensure that the failed server does not return to life without the administrator's knowledge or control. The disk-based storage array is formatted as a Red Hat Scalable File System (XFS) and exported to the HPC cluster via the NFS service of the HA cluster.
2.2. NSS-HA offerings from Dell
Table 1 lists the available Dell NSS-HA solutions with standard configurations.

Table 1. NSS-HA Solutions(1), (2), (3), (4), (5), (6)
NSS5.5-HA Release (April 2014): "PowerVault MD3460 based solution"
NSS6.0-HA Release (November 2014): "PowerEdge 13th generation server based solution"
Storage Capacity: 180TB to 360TB of raw storage capacity.
3. Dell PowerVault MD3460 and MD3060e storage arrays
Compared to previous versions of the NSS-HA solution, a major change in the current version is the introduction of 4TB disks. The previous NSS-HA solutions(3), (4), (5), (6) used 3TB disks. The PowerVault MD3460 and MD3060e storage arrays were used in NSS5.5-HA(6), and they are still used in NSS6.0-HA.
4. Evaluation
The architecture proposed in this white paper was evaluated in the Dell HPC lab. This section describes the test methodology and the test bed used for verification. It also contains details on the functionality tests. Performance tests and results follow in Section 5.
4.1.
Figure 2. NSS6.0-HA test bed
[Figure: clients on a public network (IB or 10GbE); two R630 servers joined by a private network; one MD3460 + one MD3060e; two PDUs. NSS6.0-HA 480TB configuration. Legend: public network, private network, power, storage connections.]

Table 3. NSS6.0-HA hardware configuration
Server configuration
NFS server model: Two Dell PowerEdge R630s.
Processor: Dual Intel Xeon E5-2697 v3 @ 2.
Systems Management: iDRAC8 Enterprise.
Power Supply: Dual Power Supply Units.
Storage configuration
Storage Enclosure: One Dell PowerVault MD3460 array and one MD3060e array for the 480TB solution.
RAID controllers: Duplex RAID controllers in the Dell MD3460.
Hard Disk Drives: 60 x 4TB 7200 rpm NL SAS drives per array.
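The 480TB raw capacity follows from the drive counts listed above: two arrays (one MD3460 and one MD3060e), 60 drives per array, and 4TB per drive. A minimal shell sketch of that arithmetic is shown below; the function name raw_capacity_tb is illustrative only and is not part of the solution.

```shell
#!/bin/sh
# Raw capacity in TB = arrays x drives per array x TB per drive.
# raw_capacity_tb is a hypothetical helper name used only for this sketch.
raw_capacity_tb() {
    arrays=$1
    drives_per_array=$2
    tb_per_drive=$3
    echo $(( arrays * drives_per_array * tb_per_drive ))
}

# One MD3460 plus one MD3060e, 60 drives each, 4TB per drive.
raw_capacity_tb 2 60 4   # prints 480
```

With 3TB drives in the same two enclosures, the same formula gives 360TB.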
Table 5. NSS6.0-HA client cluster configuration
Client / HPC Compute Cluster
Clients: 64 PowerEdge M420 blade servers
32 blades in each of two PowerEdge M1000e chassis
Red Hat Enterprise Linux 6.4 x86-64.
Note: The analysis below assumes that the HA cluster service is running on the active server; the passive server is the other component of the cluster.

Table 6. NSS-HA mechanisms to handle failures
Failure type: Mechanism to handle failure
Single local disk failure on a server: Operating system installed on a two-disk RAID 1 device with one hot spare.
Fence device failure
Single SAS link failure
Multiple SAS link failures

The NSS-HA behaviors in response to these failures are outlined below.

Server failure: simulated by introducing a kernel panic. When the active server fails, the heartbeat between the two servers is interrupted.
Impact to clients
Clients mount the NFS file system exported by the server using the HA service IP. This IP is associated with either an IPoIB or a 10 Gigabit Ethernet network interface on the NFS server. To measure any impact on the client, the dd utility and the IOzone benchmark were used to read and write large files between the clients and the file system.
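As an illustration of mounting through the HA service IP, the sketch below prints (but does not run) the kind of mount command a client might use. The IP address, export path, and mount point are placeholders chosen for this example, and the function name nfs_mount_cmd is hypothetical; the mount options used in the tested configuration are not shown here.

```shell
#!/bin/sh
# Print (do not run) the mount command a client might use.
# The address, export path, and mount point below are placeholders.
HA_SERVICE_IP="192.0.2.10"     # hypothetical floating IP of the HA service
EXPORT_PATH="/export/xfs"      # hypothetical export path on the NFS server
MOUNT_POINT="/mnt/xfs"         # hypothetical client mount point

# nfs_mount_cmd is an illustrative helper, not part of the solution.
nfs_mount_cmd() {
    echo "mount -t nfs $1:$2 $3"
}

nfs_mount_cmd "$HA_SERVICE_IP" "$EXPORT_PATH" "$MOUNT_POINT"
```

Because clients address the service IP rather than the address of a physical server, the same mount definition applies regardless of which server currently hosts the HA service.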
Leveraging the latest PowerEdge R630 server and RHEL 7.0, significant sequential I/O performance improvements were observed during our tests:
The peak read performance of NSS6.0-HA was up to 6.07 GB/sec, and read performance improved by around 75% on average compared to NSS5.5-HA(6).
The peak write performance of NSS6.0-HA was up to 2.
Figure 4. IPoIB random write and read performance
[Chart: NFS IB random I/O performance. Y-axis: IOPS; X-axis: number of concurrent clients (1, 2, 4, 8, 16, 32, 48, 64); series: Write, Read.]

6.
7. References
1. Dell HPC NFS Storage Solution High Availability Configurations, Version 1.1
http://i.dell.com/sites/content/business/solutions/whitepapers/en/Documents/dell-hpc-nsshasg.pdf
2. Dell HPC NFS Storage Solution - High availability with large capacities, Version 2.1
http://i.dell.com/sites/content/business/solutions/engineering-docs/en/Documents/hpc-nfsstorage-solution.pdf
3.
Appendix A: Benchmarks and test tools
The IOzone benchmark tool was used to measure sequential read and write throughput (MB/sec) as well as random read and write I/O operations per second (IOPS). The checkstream utility was used to test for data correctness under failure and failover cases.
IOzone Argument    Description
-t                 Number of threads.
-+m                Location of clients to run IOzone when in clustered mode.
-w                 Does not unlink (delete) temporary file.
-I                 Use O_DIRECT, bypass client cache.
-O                 Give results in ops/sec.
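To show how the arguments above combine, the sketch below prints (but does not run) a possible IOzone command line for a clustered random I/O test. The -i, -r, and -s options are standard IOzone options that do not appear in the rows above; the record size, file size, thread count, and client list path are assumptions chosen for illustration, not the values used in these tests. The function name iozone_random_cmd is hypothetical.

```shell
#!/bin/sh
# Print (do not run) an IOzone command line for a clustered random-I/O test.
# Record size, file size, and the client list path are illustrative values.
iozone_random_cmd() {
    threads=$1        # -t: number of threads
    client_list=$2    # -+m: file listing the clients for clustered mode
    # -i 2: random read/write test; -w: keep temporary files;
    # -I: use O_DIRECT; -O: report ops/sec;
    # -r: record size; -s: file size per thread.
    echo "iozone -i 2 -w -I -O -r 4k -s 4g -t $threads -+m $client_list"
}

iozone_random_cmd 64 ./clientlist
```

IOzone's random test operates on existing files, so a sequential write pass (-i 0) run with -w would normally precede a command like this one.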
A.2. Checkstream
The checkstream utility is available at http://sourceforge.net/projects/checkstream/. Version 1.0 was installed and compiled on the NFS servers and used for these tests. First, a large file was created using the genstream utility. This file was copied to and from the NFS share by each client using dd to mimic write and read operations.
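The copy-and-verify pattern described above can be sketched as follows. This is not the checkstream procedure itself: cmp stands in for checkstream and random data stands in for a genstream file, because the options of those utilities are not reproduced here. The function name roundtrip_check and all paths are hypothetical, and the demo runs on local temporary files rather than an NFS share.

```shell
#!/bin/sh
# Round-trip a file through a target directory with dd and verify it byte for byte.
# cmp stands in for checkstream here; checkstream's own options are not shown.
roundtrip_check() {
    src=$1       # source file (the white paper generated it with genstream)
    share=$2     # directory standing in for the NFS mount point
    copy="$share/stream.copy"
    back="$src.readback"
    dd if="$src" of="$copy" bs=1M 2>/dev/null || return 1   # mimic write
    dd if="$copy" of="$back" bs=1M 2>/dev/null || return 1  # mimic read
    if cmp -s "$src" "$back"; then
        echo "OK"
    else
        echo "MISMATCH"
    fi
    rm -f "$copy" "$back"
}

# Self-contained demo on local temporary files (no NFS share required).
tmp=$(mktemp -d)
head -c 1048576 /dev/urandom > "$tmp/stream.src"
mkdir "$tmp/share"
roundtrip_check "$tmp/stream.src" "$tmp/share"
rm -rf "$tmp"
```

In a failover test, the second argument would be the client's NFS mount point, and the copy would be in flight while the failure is introduced.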
# dd if=/dev/zero of=/mnt/xfs/file bs=1M count=90000

To read data from the storage, the following command line was used.