Dell HPC NFS Storage Solution High Availability Configurations with Large Capacities High availability NSS configurations for capacities greater than 100TB. Xin Chen, Garima Kochhar and Mario Gallegos Dell HPC Engineering Version 2.
This document is for informational purposes only and may contain typographical errors and technical inaccuracies. The content is provided as is, without express or implied warranties of any kind. © 2012 Dell Inc. All rights reserved. Dell and its affiliates cannot be responsible for errors or omissions in typography or photography. Dell, the Dell logo, and PowerEdge are trademarks of Dell Inc.
Contents
Executive summary (Updated May 2012)
1. Introduction
2. NSS-HA solution review
2.1. Availability in NSS-HA
3.
A.3.2. Configure Multipath
A.3.3. Install Mellanox OFED package and set network IPs
A.3.4. Install Operating system and storage management tools
A.3.5. Network security setting
Figures
Figure 1. Overview of the NSS-HA solution
Figure 2. A failure scenario in NSS-HA
Figure 3. NSS-HA architectural diagram
Figure 4. NFS server configuration in NSS-HA
Executive summary (Updated May 2012)
This solution guide describes the large capacity configurations of the Dell HPC NFS Storage Solution with high availability support (NSS-HA). It presents an architecture overview, and provides tuning best practices and performance details for configurations with capacities of 144TB and 288TB. These configurations break the 100TB limit of previously supported configurations.
1. Introduction
This Solution Guide provides information on the latest Dell NFS Storage Solution high availability configurations (NSS-HA). The NSS-HA uses the NFS file system along with the Red Hat Scalable File System (XFS) and Dell PowerVault storage to provide an easy-to-manage, reliable and cost-effective storage solution for HPC clusters.
Figure 1.
2.1. Availability in NSS-HA
A major goal of the NSS-HA solution is to improve storage service availability in the presence of possible failures or faults. This goal is achieved by a “failover” process implemented by the Red Hat Enterprise High Availability Cluster software stack. Figure 2 shows a typical scenario of how storage service availability is maintained in the NSS-HA solution.
Appendix A: NSS-HA Recipe includes detailed instructions on the configuration steps.
3. NSS-HA architecture
Figure 3 presents the architectural diagram of the NSS-HA solution. A pair of PowerEdge R710 servers is configured as an active-passive HA pair and functions as an NFS gateway for the HPC compute cluster (also called the clients).
Figure 3.
Figure 4. NFS server configuration in NSS-HA
The NSS-HA architecture is discussed in detail in the previous version of this solution guide (4).
3.1. Storage in NSS-HA
The NSS-HA is a storage solution in which a shared storage array is directly connected to the HA cluster nodes, as shown in Figure 1 and Figure 3. Access to the storage is provided to users via the HA service defined in the HA cluster.
2) Step 2 – Logical volume configuration
In this step, a logical volume is created to access the capacity configured on the storage arrays. NSS-HA requires a simple way to manage and scale a storage stack, so the Linux Logical Volume Manager (LVM) is used. To create a logical volume, physical volumes (PVs) are created first.
3.2. Potential failures and fault tolerant mechanisms in NSS-HA
Many different types of failures and faults can impact the functionality of NSS-HA. Table 1 lists the potential failures that can be tolerated in an NSS-HA solution based on the architecture described in Section 3.
4. New components and updates from previous versions of the solution
This section provides information on the updates in this version of the NSS-HA solution when compared to the previous version (4). The current version includes several major changes and updates to various components of the solution.
4.1. Storage density
In previous versions of NSS-HA, each storage enclosure was equipped with 12 3.
4.2. Storage configuration
In previous versions of the solution, the file system had a maximum of four virtual disks. A Linux physical volume was created on each virtual disk. The physical volumes were grouped together into a Linux volume group and a Linux logical volume was created on the volume group. The XFS file system was created on this logical volume.
Figure 6.
release, do not refer to the instructions listed in the previous RHEL 5.5 NSS-HA solution guide (4) when configuring HA on RHEL 6.1 based clusters.
4.4. Red Hat scalable file system package
Previous versions of NSS-HA used XFS version 2.10.2-7, which is distributed with RHEL 5.5. The current version of NSS-HA uses XFS version 3.1.1-4, which is distributed with RHEL 6.1.
Table 4. New server components in this release
Server components Previous release (4)
NFS server PowerEdge R710
Memory per NFS server 48 GB 96 GB More memory to improve performance where a large cache is useful. Also to manage possible XFS repair operations on the larger capacity file system.
Operating System RHEL 5.5 x86_64 RHEL 6.
5.4 describes these functionality tests and their results. Functionality testing was similar to work done in the previous versions of the solution (4). A 64-node HPC cluster was used to provide the I/O workload to test the performance of the NSS-HA. The performance of the 144TB and 288TB solutions was measured against this test bed for both InfiniBand and Ethernet based clients.
Figure 7.
Table 5. NSS-HA hardware configuration details
Server configuration
NFS server model: Two PowerEdge R710
Processor: Dual Intel Xeon E5630 @ 2.53GHz
Memory: 12 * 4GB 1333MHz RDIMMs (The test bed used 48GB; the recommendation for production clusters is to use 96GB).
Table 6. NSS-HA software configuration details
Software
Operating system: Red Hat Enterprise Linux (RHEL) 6.1 x86_64
Kernel version: 2.6.32-131.0.15.el6 x86_64
Cluster Suite: Red Hat Cluster Suite from RHEL 6.1
File system: Red Hat Scalable File System (XFS) 3.1.1-4
Systems Management: Dell OpenManage Server Administrator 6.5.0
Storage Management: Dell Modular Disk Storage Manager 3.0.0.18
Table 7.
Table 8. NSS-HA client configuration details
Client / HPC Compute Cluster
Clients: 64 PowerEdge R410 compute nodes, Red Hat Enterprise Linux 6.1 x86-64
InfiniBand: Mellanox ConnectX-2 QDR HCA, Mellanox OFED 1.5.3-3.0.0
InfiniBand fabric: All clients connected to a single large port count InfiniBand switch (Mellanox IS5100). Both R710 NSS-HA servers also connected to the InfiniBand switch.
NFS Server configuration
7) The XFS file system is mounted with the wsync option.
8) The XFS file system is exported using the NFS sync option.
9) Number of concurrent NFS threads is increased from a default of 8 to 256 on the NFS servers.
10) The default OS scheduler is changed from cfq to deadline.
11) MTU is set to 9000 on the 10 Gigabit Ethernet networks.
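The scheduler change in item 10 can be checked from sysfs, where the kernel marks the active scheduler with square brackets. The following is a minimal sketch, not part of the original recipe: the function name active_scheduler and the device name sdb are illustrative assumptions.

```shell
# Print the active I/O scheduler from a sysfs scheduler line.
# The kernel brackets the active entry, for example:
#   "noop anticipatory deadline [cfq]"  ->  cfq
active_scheduler() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Hypothetical usage on a server ("sdb" is a placeholder device name):
#   active_scheduler "$(cat /sys/block/sdb/queue/scheduler)"
#   echo deadline > /sys/block/sdb/queue/scheduler    # run as root
```

A value written to the sysfs file does not persist across reboots, so a check like this is mainly useful for confirming the setting after startup.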
2) Heartbeat link failure - simulated by disconnecting the private network link on the active server. When the heartbeat link is removed from the active server, both servers detect the missing heartbeat and attempt to fence each other. The active server is unable to fence the passive server because the missing link prevents it from communicating over the private network.
Impact to clients
Clients mount the NFS file system exported by the server using the HA service IP. This IP is associated with either an InfiniBand or a 10 Gigabit Ethernet network interface on the NFS server. To measure any impact on the client, the dd utility and the iozone benchmark were used to read and write large files between the client and the file system.
Metadata tests were performed using the mdtest benchmark and include file stat, create and delete operations. While these benchmarks do not cover every I/O pattern, they help characterize the I/O performance of the NSS-HA solution. As mentioned in Section 5.3 bullet (12), performance was evaluated using NFSv3 as well as NFSv4.
The sequential read performance, as shown in Figure 9, peaks at ~2000MB/s.
Figure 9. InfiniBand large sequential read performance (throughput in MB/s versus number of concurrent clients; series: 144 TB NFSv3 and 288 TB NFSv3)
6.2.
Figure 10. 10GbE large sequential write performance (throughput in MB/sec versus number of concurrent clients; series: 144 TB NFSv3 and 288 TB NFSv3)
Figure 11.
A single NFS client with a 10 Gigabit Ethernet connection to the storage solution was also tested. In this case the single 10GbE client and the NFS server were both directly connected to the PowerConnect 8024 10GbE switch. Results for this test using the NFSv3 protocol are shown in Figure 12. The graph shows that write throughput is ~880MB/s and read throughput is ~1000MB/s for the 144TB configuration.
Figure 13. InfiniBand random write performance (IOPS versus number of concurrent clients; series: 144 TB NFSv3 and 288 TB NFSv3)
The NSS-HA write performance is limited by several design factors including write cache mirroring on the RAID controllers, the XFS wsync mount option and the NFS sync export option.
Figure 14. InfiniBand random read performance (IOPS versus number of concurrent clients; series: 144 TB NFSv3 and 288 TB NFSv3)
6.4. Metadata tests
From past experience, it was expected that metadata test results would be very similar for 10GbE and IPoIB; however, that was not observed in this environment.
expected the file create and file remove performance is similar, since both involve write-type operations.
Figure 15.
Figure 16. InfiniBand file stat performance (number of stat() operations per second versus number of concurrent clients; series: 144 TB NFSv3 and 288 TB NFSv3)
Figure 17.
6.5. NFSv3 compared to NFSv4
During the design and analysis of the NSS-HA solution it was found that NFSv3 provides better performance than NFSv4 for certain scenarios. This section describes the differences in performance between NFSv3 and NFSv4. In certain situations the security enhancements of NFSv4 might be more important than any loss in performance.
With IPoIB, sequential reads were found to be up to 22% better with NFSv4, while sequential write throughput was up to 22% worse with NFSv4. This is shown in Figure 19 for a 288TB configuration. The 144TB configuration shows a similar trend.
Figure 19. InfiniBand NFSv3 and NFSv4 sequential performance
operations, NFSv3 versus NFSv4 performance depended on the number of concurrent clients in the test. For single client runs NFSv3 was better by 9%. With 4 and 16 clients NFSv4 was substantially better than NFSv3, by up to 35%. For all other cases, NFSv4 was marginally better. The 144TB configuration showed a similar trend. Due to time constraints, 10GbE performance with NFSv4 was not measured in time for this publication.
8. References
1) XFS: A high-performance journaling file system. http://oss.sgi.com/projects/xfs/
2) Quick SAS Cabling Guide --A Dell Technical White Paper. http://www.dell.com/downloads/global/products/pvaul/en/powervault-md3200-m3200i-cablingguide.pdf
3) Red Hat Enterprise Linux 6 Cluster Administration -- Configuring and Managing the High Availability Add-On. http://docs.redhat.
Appendix A: NSS-HA Recipe (Updated May 2012)
Contents
A.1. Pre-install preparation
A.1.1. NSS-HA cluster specification
A.1.2. Checklist
A.2.
A.1. Pre-install preparation
The following figure shows an NSS-HA cluster.
A.1.1.
PDU 2
Login name: apc
Password: apc
Port 2 for active, port 3 for passive
Note: Instructions to configure iDRAC and the APC PDU are provided in section A.3.7.
nssha61_single.py: Configures storage devices, including creating/removing PVs, VGs, LVs, and the XFS file system.
sas_path_check.sh: Used by the HA cluster management tool to monitor the status of SAS paths.
ibstat_script.sh: If IPoIB is deployed, this script is used by the HA cluster management tool to monitor IB link status.
Note: all scripts are attached to this document and can be found in section A.
A.2. Server hardware setup
1. Prepare two PowerEdge R710 servers (called “active” and “passive”). Configure each server as follows.
o One PERC H700 and 5 local disks each of 146 GB.
o Configure 2 disks in RAID 1 with 1 additional disk designated as the hot spare. This will be used for the operating system.
o Configure 2 disks in RAID 0; this will be used as swap.
A.2.1. Checklist
Before moving to the next section, make sure all of the following tasks are completed.
Tasks
Install hard disks, SAS cards, 10GbE card or IB card, and iDRAC Enterprise on each R710.
Configure local disks on each R710.
Connect all PDUs, iDRACs, and two R710s to a Gigabit switch.
Connect the switch and two R710s to the two PDUs.
Connect each R710 to the public network via 10GbE or IB links.
A.3. Server software configuration
All the operations below apply to each R710.
A.3.1. Install RHEL 6.1, configure swap disks, and install XFS packages
1. Install the RHEL 6.1 x86_64 operating system (kernel version 2.6.32-131.0.15.el6.x86_64) on the RAID1 virtual disk.
o Make sure MD storage is not attached to the servers during the OS installation.
A.3.2. Configure Multipath
On each R710, exclude the local disks from the control of multipathd.
1. Make sure the multipath software is installed. To verify that the multipath software is installed:
# rpm -qa | grep multipath
device-mapper-multipath-0.4.9-41.el6.x86_64
device-mapper-multipath-libs-0.4.9-41.el6.x86_64
If the packages are not present, install them and then run the following command to create the /etc/multipath.
Then, edit the file /etc/multipath.conf, search for the blacklist section and uncomment it or modify it if already uncommented. Add the “wwid” entries for the local virtual disks to make it look like the following example:
blacklist {
    wwid "36842b2b0723e980017184fb50ae00d6a"
    wwid "36842b2b0723e980017184fcf0c6c56e9"
    … {Rest of your original blacklist, if any}
}
c.
A.3.3. Install Mellanox OFED package and set network IPs
1. Install Mellanox OFED 1.5.3-3.0.0 if using InfiniBand (MLNX_OFED_LINUX-1.5.3-3.0.0-rhel6.1x86_64.iso).
Note: If a 10GbE network is deployed, skip this step.
You may need to install the dependencies:
glibc-devel-2.12-1.25.el6.i686.rpm
tcl-8.5.7-6.el6.x86_64.rpm
tk-8.5.7-5.el6.x86_64.rpm
2.
A.3.4. Install Operating system and storage management tools
1. Install Dell OpenManage Server Administrator (http://downloads.dell.com/sysman/OMSrvAdmin-Dell-Web-LX-6.5.0-2247.RHEL6.x86_64_A01.5.tar.gz)
If needed, install the included security key using the command:
# rpm --import RPM-GPG-KEY
If the setup fails citing missing dependencies, install the missing rpms from the RHEL 6.1 DVD:
libcmpiCppImpl0-2.0.1-5.el6.x86_64.
A.3.5. Network security setting
In this step ports will be enabled on both servers. The list of cluster ports to be enabled is in the Red Hat Cluster Administration Guide, section 2.3. http://docs.redhat.com/docs/enUS/Red_Hat_Enterprise_Linux/6/pdf/Cluster_Administration/Red_Hat_Enterprise_Linux-6Cluster_Administration-en-US.
LOCKD_TCPPORT=890
LOCKD_UDPPORT=890
MOUNTD_PORT=892
STATD_PORT=12025
At this point the servers are ready to accept NFSv3 traffic through the firewall. Alternatively, turn off the firewall.
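Because the firewall rules only work if the NFS helper daemons are pinned to the ports listed above, it can be useful to verify the settings file before restarting services. The sketch below assumes the four variables live in a KEY=VALUE file such as /etc/sysconfig/nfs (the usual location on RHEL 6, but an assumption here); the function names get_setting and check_nfs_ports are hypothetical.

```shell
# Print the value assigned to a variable in a KEY=VALUE file.
# Commented-out lines are ignored because "#KEY" does not equal "KEY".
get_setting() {
    # $1 = file, $2 = variable name
    awk -F= -v key="$2" '$1 == key { print $2 }' "$1" | tail -n 1
}

# Compare the four port settings from this section with the file contents.
# Returns 0 when all match, 1 otherwise.
check_nfs_ports() {
    # $1 = path to the settings file (assumed: /etc/sysconfig/nfs)
    cnp_rc=0
    for cnp_pair in LOCKD_TCPPORT=890 LOCKD_UDPPORT=890 MOUNTD_PORT=892 STATD_PORT=12025; do
        cnp_name="${cnp_pair%%=*}"
        cnp_want="${cnp_pair#*=}"
        cnp_got="$(get_setting "$1" "$cnp_name")"
        if [ "$cnp_got" != "$cnp_want" ]; then
            echo "MISMATCH $cnp_name: expected $cnp_want, found '$cnp_got'"
            cnp_rc=1
        fi
    done
    return "$cnp_rc"
}

# Hypothetical usage: check_nfs_ports /etc/sysconfig/nfs && echo "ports OK"
```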
2. Configure two APC PDUs, and make sure the login name, password, and IP address are configured according to the NSS-HA cluster specification in section A.1.1.
A.3.8. Checklist
Before moving to the next section, make sure all of the following tasks are completed.
Tasks
Install OS, configure swap disks, and install XFS packages.
Configure multipath.
Install Mellanox package and set network IPs.
Install OMSA and storage management tool.
Network security setting.
Configure startup service.
Configure and test the configuration of PDUs and iDRAC.
A.4. Performance tuning on the server
All the operations below apply to each R710.
1. If the clients access the NFS server via 10GbE, configure the MTU on the 10GbE device to be 8192 for both the active and the passive server. Note that the switches need to be configured to support large MTU as well.
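A mismatched MTU between servers, clients and switches is easy to miss, so a quick check of the configured value may help. This is a minimal sketch that reads the MTU from a sysfs-style file; the function name mtu_matches is hypothetical, and the interface name p2p1 is borrowed from the ethtool example later in this recipe and may differ on other systems.

```shell
# Succeed when the MTU stored in a sysfs-style file equals the expected value.
mtu_matches() {
    # $1 = file holding the MTU (for example /sys/class/net/<iface>/mtu)
    # $2 = expected MTU
    [ "$(cat "$1")" = "$2" ]
}

# Hypothetical usage on a server whose 10GbE device is p2p1:
#   mtu_matches /sys/class/net/p2p1/mtu 8192 && echo "MTU OK"
# Setting the value for the running system (as root):
#   ip link set dev p2p1 mtu 8192
```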
Restart the NFS service:
# service nfs restart
Run the nfsstat command. It should show output only for v3 and not for v4.
Another option is to make the change on the clients. On each of the clients, mount the NFS share using the option “-o vers=3”. This will use NFSv3 for the clients. If the server supports NFSv4, the default mount option without an explicit vers=3 parameter will be NFSv4.
A.4.1.
A.5. Storage hardware setup
1. Cable the MD3200 to the SAS 6 Gbps cards on the servers as shown in the figure below. Each server has two dual-port SAS cards. Cable one port on each SAS card to the storage. That is, each server will have one cable per SAS card going to the MD3200.
Reference: Dell PowerVault MD3200 and MD3220 Storage Arrays Deployment Guide, http://support.dell.com/support/edocs/systems/md3200/en/DG/PDF/DG.
Cabling for 288TB configuration (figure showing the R710 SAS HBA ports 0 and 1, RAID controllers 1 and 2 of the MD3200, and the SAS IN and SAS OUT ports of EMM 1 and EMM 2)
Note: Do not cable storage to the R710s before the OS install, and use the leftmost port of each SAS card on the two R710s.
2. To cable the MD1200s to the MD3200, refer to the figure above.
Reference: Quick SAS Cabling Guide --A Dell Technical White Paper. http://www.dell.com/downloads/global/products/pvaul/en/powervault-md3200-m3200i-cablingguide.pdf
3. Power on the MD3200 and all MD1200s.
A.6. Storage configuration
1. Launch the MDSM management GUI on one R710. Discover the attached storage array via in-band management and add the storage array to the management GUI.
2. Create a host group (named NSS-HA61) and add the active and passive servers to the group.
3.
Make sure the host port identifiers on each R710 match the sas_address from /sys/class/sas_phy/phy-1:0/sas_address and /sys/class/sas_phy/phy-2:0/sas_address.
6. On each R710, run the command rescan_dm_devs to detect all the virtual disks.
7. On each R710, cat /proc/partitions and multipath -ll should show all the LUNs on the storage.
Reference: Configuration: Device Mapper Multipath for Linux, http://support.dell.
A.6.1. Checklist
Before moving to the next section, make sure all of the following tasks are completed.
Tasks and notes
Configure the storage array: Section A.6, steps 1, 2, 3, and 4.
Check the storage array is correctly configured: Section A.6, steps 5, 6, 7 and 8. It is very important to make sure all four steps are successfully completed.
A.7. NSS HA cluster setup
In this recipe the term “cluster” refers to the active-passive NSS-HA Red Hat cluster.
A.7.1. Prepare
1. On both R710s install the cluster software packages.
# yum install -y ricci rgmanager cman openais lvm2-cluster ccs
# service ricci start; chkconfig ricci on
2. Set a password for user ricci using the command below.
# passwd ricci
3. Create a mount point for the file system.
6. Check that the public interface is up on both servers. This is the 10GbE link or the InfiniBand link.
For 10GbE, use
# ethtool p2p1 | grep "Link detected"
If the link is up, the output will display “Link detected: yes”.
For InfiniBand, use
# ibstat | grep "Physical state"
If the link is up, the output will display “Physical state: LinkUp”.
7. Check /etc/lvm/lvm.conf and make sure locking_type is set to 3.
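The locking_type check in step 7 can be scripted so that it ignores commented-out lines in lvm.conf. This is a minimal sketch under that one goal; the function name lvm_locking_type is hypothetical and not part of the scripts shipped with this recipe.

```shell
# Print the locking_type value from an lvm.conf-style file.
# Lines that start with "#" (after optional whitespace) are skipped because
# the pattern requires the setting name at the start of the line.
lvm_locking_type() {
    # $1 = path to lvm.conf
    sed -n 's/^[[:space:]]*locking_type[[:space:]]*=[[:space:]]*\([0-9][0-9]*\).*/\1/p' "$1" | tail -n 1
}

# Hypothetical usage:
#   [ "$(lvm_locking_type /etc/lvm/lvm.conf)" = "3" ] || echo "locking_type is not 3"
```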
1. Generate cluster configuration file. On the “active” R710, manually modify the cluster configuration script according to the cluster spec.
In /root/config_cluster.sh
Modify:
machine_active="active"     #check section A.1.1
machine_passive="passive"   #check section A.1.1
pdu1_ip="15.15.10.101"      #check
pdu2_ip="15.15.10.102"      #check
pdu_active_port="2"         #check
pdu_passive_port="3"        #check
drac_active_ip="15.15.10.
 ------ ----        ---- ------
 active                1 Online, Local
 passive               2 Online
4. Start the clvmd service on both servers to prepare for the creation of physical volumes, a volume group and a logical volume in an HA cluster. Make sure locking_type is set to 3 in /etc/lvm/lvm.conf before executing the following command.
# service clvmd start
Also make sure that the multipathd service is running.
5.
[root@active ~]# clustat
Cluster Status for NSS61 @ Thu Jan 5 10:41:48 2011
Member Status: Quorate

 Member Name        ID   Status
 ------ ----        ---- ------
 active                1 Online, rgmanager
 passive               2 Online, Local, rgmanager

 Service Name       Owner (Last)       State
 ------- ----       ----- ------       -----
 service:HA1        active             started

On the server that is running the service, check that the XFS file system is mounted.
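One way to perform the mount check is to look for an xfs entry at the expected mount point in /proc/mounts. The sketch below is an illustration only: the function name is_xfs_mounted is hypothetical, and /mnt/xfs1 is the mount point used in the manual storage configuration commands of this recipe.

```shell
# Succeed when the given mount point appears with file system type xfs
# in a /proc/mounts-style file (fields: device, mount point, type, ...).
is_xfs_mounted() {
    # $1 = mounts file (normally /proc/mounts)
    # $2 = mount point, without a trailing slash
    awk -v mp="$2" '$2 == mp && $3 == "xfs" { found = 1 } END { exit (found ? 0 : 1) }' "$1"
}

# Hypothetical usage on the server that owns the HA service:
#   is_xfs_mounted /proc/mounts /mnt/xfs1 && echo "XFS is mounted"
```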
An SELinux policy can be generated from logs of denied operations. Check /var/log/audit/audit.log for denied operations. If there are none relating to the cluster, test fencing as described in the “Quick test of HA set-up” section and then follow the steps below.
# grep avc /var/log/audit/audit.log | audit2allow -M NSSHApolicy
Install the module on both servers.
# semodule -i NSSHApolicy.pp
Reference https://bugzilla.redhat.
A.8. Quick test of HA setup
1. Test fencing:
o First disable the cluster service:
[root@active ~]# clusvcadm -d HA1
o From the active server run the command fence_node passive. This should power cycle the passive server via iDRAC. Check /var/log/messages on active.
o From passive run the command fence_node active. This should power cycle the active server via iDRAC. Check /var/log/messages on passive.
A.9. Useful commands and references
This section provides several commands for HA cluster configuration, management, and debugging, and also gives instructions for configuring a storage array manually.
A.9.1. Manually modify cluster configuration file
If the /etc/cluster/cluster.conf file is edited manually, make the changes only on one server and increment the version number field.
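Forgetting to increment the version number is a common slip when editing cluster.conf by hand. The sketch below assumes the version is stored in the config_version attribute of the cluster element, as in RHEL 6 cluster.conf files; the function name bump_config_version is hypothetical. It writes the updated text to standard output and leaves the input file unchanged.

```shell
# Print a copy of a cluster.conf-style file with the first
# config_version="N" attribute replaced by config_version="N+1".
bump_config_version() {
    # $1 = path to cluster.conf
    awk '{
        if (!done && match($0, /config_version="[0-9]+"/)) {
            # 16 is the length of the literal text config_version="
            old = substr($0, RSTART + 16, RLENGTH - 17)
            $0 = substr($0, 1, RSTART - 1) "config_version=\"" (old + 1) "\"" substr($0, RSTART + RLENGTH)
            done = 1
        }
        print
    }' "$1"
}

# Hypothetical usage (review the result before replacing the original):
#   bump_config_version /etc/cluster/cluster.conf > /tmp/cluster.conf.new
```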
3. Start cluster service. Assume the service name is HA1, it is disabled, and it will be started on the “passive” server.
# clusvcadm -e HA1 -m passive
4. Relocate cluster service. Assume the service name is HA1, it is running on the “passive” server, and it will be relocated to the “active” server.
# clusvcadm -r HA1 -m active
A.9.3.
# service cman stop
A.9.5. Configure the shared storage array manually
If manual configuration of the storage array is preferred instead of using the script “nssha61_single.py” mentioned in Section A.7.2, step 5, follow the instructions below.
1. Before configuring a shared storage array, make sure service clvmd is running on both servers, and make sure the step 7 in section A.7.
size=27T features='3 queue_if_no_path pg_init_retries 50' hwhandler='1 rdac' wp=rw
|-+- policy='round-robin 0' prio=6 status=active
| `- 2:0:0:0 sdk 8:160 active ready running
`-+- policy='round-robin 0' prio=1 status=enabled
  `- 1:0:0:0 sdc 8:32 active ghost running
The LUN id is the last number of 2:0:0:x or 1:0:0:x. The order should be mpatha, mpathb, mpathd, mpathc.
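Reading the LUN id off each path line by eye is error-prone when many paths are listed. The following sketch extracts the device name and LUN id from path lines in the format shown above; the function names lun_id and path_luns are hypothetical, and the parsing assumes the host:channel:target:lun field is immediately followed by the device name, as in this output.

```shell
# Print the LUN id, which is the last number of a host:channel:target:lun address.
lun_id() {
    # $1 = SCSI address such as 2:0:0:3
    printf '%s\n' "${1##*:}"
}

# Read multipath -ll style text on stdin and print "<device> <lun>" for every
# path line, that is, every line containing a four-part H:C:T:L address.
path_luns() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i ~ /^[0-9]+:[0-9]+:[0-9]+:[0-9]+$/) {
                n = split($i, parts, ":")
                print $(i + 1), parts[n]
            }
        }
    }'
}

# Hypothetical usage: multipath -ll | path_luns
```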
a. Configure 144TB storage array.
# vgcreate VGMD1 /dev/mapper/mpatha /dev/mapper/mpathb /dev/mapper/mpathd /dev/mapper/mpathc
# lvcreate -i 4 -I 1024 -l 100%FREE VGMD1 -n LVMD1
# mkfs.xfs -l size=128m /dev/VGMD1/LVMD1
# mount -o noatime,allocsize=1g,nobarrier,inode64,logbsize=262144,wsync /dev/VGMD1/LVMD1 /mnt/xfs1/
b. Extend 144TB configuration to 288TB.
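As a worked example for the lvcreate command in step a: the -i option gives the number of stripes (one per physical volume) and -I gives the stripe size in KB, so a full stripe across the four virtual disks is 4 x 1024 KB = 4096 KB. The helper below is a sketch of that arithmetic only; the function name full_stripe_kb is hypothetical.

```shell
# Compute the full stripe width in KB for a striped logical volume.
full_stripe_kb() {
    # $1 = number of stripes (lvcreate -i)
    # $2 = stripe size in KB (lvcreate -I)
    echo $(( $1 * $2 ))
}

# Example for the 144TB configuration:
#   full_stripe_kb 4 1024
```

With the values from step a the result is 4096, that is, 4 MB of data is written across the four physical volumes before the pattern repeats.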
A.10. Performance tuning on clients
1. If the clients access the NFS server via 10GbE, configure the MTU on the 10GbE device to be 8192 for all the clients. Note that the switches need to be configured to support large MTU as well.
A.11. Scripts
1. /root/config_cluster.sh file for automatically generating cluster configuration file
2. /root/nssha61_single.py file for automatically configuring the shared storage.
3. /root/ibstat_script.sh file for InfiniBand clusters
4. /root/sas_path_check.
Appendix B: Benchmarks and test tools
The iozone benchmark was used to measure sequential read and write throughput (MB/sec) as well as random read and write I/O operations per second (IOPS). The mdtest benchmark was used to test metadata operation performance. The checkstream utility was used to test for data correctness under failure and failover cases.
The following table describes the IOzone command line arguments.
IOzone IOPs Random Access (Reads and Writes)
# /usr/sbin/iozone -i 2 -w -r 4k -I -O -w -+n -s 2G -t 1 -+m ./clientlist
By using -c and -e in the test, IOzone provides a more realistic view of what a typical application is doing. The O_Direct command line parameter allows us to bypass the cache on the compute node on which we are running the IOzone thread.
B.2. mdtest
mdtest can be downloaded from http://sourceforge.
As with the IOzone random access patterns, the following procedure was followed to minimize cache effects during the metadata testing:
o Unmount NFS share on clients.
o Stop the cluster service on the server. This unmounts the XFS file system on the server.
o Start the cluster service on the server.
o Mount NFS share on clients.
Metadata file and directory creation test:
# mpirun -np 32 --nolocal --hostfile .
For comparison, here is an example of a failing test with data corruption in the copied file. If the file system is exported with the NFS async option and there is an HA service failover during a write operation, data corruption is likely to occur.