HP IBRIX 9000 Storage Network Best Practices Guide

Abstract

This document describes recommended practices for HP IBRIX 9000 Storage networking. For the latest IBRIX guides, browse to http://www.hp.com/support/IBRIXManuals.
© Copyright 2012 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
1 Overview of HP IBRIX 9000 Series networking

The IBRIX solution uses network attached components and associated software to implement a fault-tolerant distributed file system. This network-centric overview describes the components and networking concepts used to implement the IBRIX networking solution. Specific attention is given to the fault-tolerant aspects of the implementation, as these make the implementation more complicated than a typical network attached system.
For 1GbE configurations, HP strongly recommends that the cluster network be configured as a private network that is separate from the user data-serving network. For 10GbE configurations, HP recommends that the cluster network and user network be collapsed into a single network.

• User network
This network provides user client systems access to the file system through supported file access protocols such as NFS, SMB, FTP, and HTTP.
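As a simple illustration of user-network access, a Linux file client can check that a user VIF is reachable and is exporting NFS shares with standard operating system commands. This is a generic sketch: the address 10.30.214.200 is a hypothetical user VIF chosen for the example, and the actual VIF addresses and export names depend on the installation.

[client ~]$ ping -c 3 10.30.214.200       # confirm basic IP reachability of the user VIF
[client ~]$ showmount -e 10.30.214.200    # list the NFS exports offered through that VIF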
Figure 1 Bond redundancy

Each bond interface has an associated mode property that controls the policy for routing network traffic between the bonded interface and the aggregated physical interfaces. Linux bonding defines six distinct modes that provide different degrees of load balancing and fault tolerance. See “BOND modes” (page 68) for a description of the modes that can be employed in IBRIX platforms.
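To see which mode a given bond is actually using on an FSN, the standard Linux bonding status file can be inspected. This is a generic Linux check rather than an IBRIX-specific tool; the interface name bond0 and the slave name shown are illustrative, and the output below is abbreviated.

[prompt ~]# grep -E "Bonding Mode|Currently Active Slave|MII Status" /proc/net/bonding/bond0
Bonding Mode: fault-tolerance (active-backup)
Currently Active Slave: eth2
MII Status: up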
Fusion Manager VIF failover

The Fusion Manager uses a VIF to enable it to fail over across the cluster. A single cluster-wide IP address is chosen for the Fusion Manager. The FSN that is running the active Fusion Manager then establishes an active VIF for the Fusion Manager’s IP address. When the Fusion Manager needs to fail over to a different FSN, the following occurs:
1. The original FSN hosting the active Fusion Manager disables its FM VIF.
2. The FSN that takes over as the active Fusion Manager establishes the FM VIF for the same cluster-wide IP address.
To see how this maps onto each FSN, check the output of the Linux ifconfig command for each FSN. Only the active interfaces appear in the ifconfig output. VIFs and failover are handled at the IBRIX solution layer, which dynamically creates and removes the necessary interfaces on the FSNs to react to conditions that cause failover.
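For example, the following is one quick way to list only the interface names that are currently active on a node; a VIF that is in standby on that node simply does not appear. The interface names shown are illustrative.

[prompt ~]# ifconfig | awk '/^[a-z]/ {print $1}'
bond0
bond0:0
bond0:2
eth2
eth3
lo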
Sample ifconfig output for FSN 2:

Figure 3 and the corresponding command output illustrate the state of the FSNs after the Fusion Manager has migrated from FSN 1 to FSN 2. The only change is that the Fusion Manager VIF (bond0:0) is now inactive on FSN 1 and active on FSN 2.
After FM migration, FSN 1 has two active interfaces and two standby interfaces:
• bond0 is the cluster network interface.
• bond0:0 is not active; the Fusion Manager could move back to this interface if requested.
• bond0:2 is a user network VIF for file serving from FSN 1.
• bond0:3 is not active, but is provisioned for failover of file serving duties from FSN 2.

After FM migration, FSN 2 has three active interfaces and one standby interface:
• bond0 is the cluster network interface.
• bond0:0 is handling requests for the active Fusion Manager.
• bond0:3 is a user network VIF for file serving from FSN 2.
• bond0:2 is not active, but is provisioned for failover of file serving duties from FSN 1.
Sample ifconfig output for FSN 2 after migrating the FM from FSN 1 to FSN 2:

User VIF failover

To provide high availability to File Clients, HP recommends that IBRIX user networks use VIFs for File Client requests. One file serving node is selected as the primary FSN for the VIF, and File Clients then use this VIF when requesting files. The primary FSN is usually the node that returns those files.
Example FSN failover

The following example illustrates what occurs when a file serving node is forced to fail over. The example shows an IBRIX 9730 platform in a unified network topology. All IP addresses and other identifiers have been chosen for illustration purposes and could be different on a customer installation. In this example, the cluster starts with two active file serving nodes. FSN 1 has the active Fusion Manager and is serving files from IP address 10.30.214.202.
Sample ifconfig output for FSN 1:
Sample ifconfig output for FSN 2:

Figure 5 and the corresponding command output illustrate the state of the FSNs after FSN 1 has failed and failover has occurred to FSN 2. The Fusion Manager VIF (bond0:0) and the FSN 1 User VIF (bond0:2) have both moved to FSN 2.
After failover, FSN 1 has one active interface and three standby interfaces:
• bond0 is the cluster network interface. Notice this does not change during a failover.
• bond0:0 is not active. If FSN 1 is recovered, the active Fusion Manager could move back to this interface.
• bond0:2 is not active; FSN 2 has taken over this IP address. If FSN 1 is recovered, its file serving could fail back to this interface.
• bond0:3 is not active; it remains provisioned for failover of file serving duties from FSN 2.
Output of ifconfig on FSN 1 after failover from FSN 1 to FSN 2:
Output of ifconfig on FSN 2 after failover from FSN 1 to FSN 2:

Cluster network implications

The cluster network’s role in cluster management makes it undesirable for the cluster network to have the same failover characteristics as a user network. Instead, it is important that each FSN has a unique identity on the cluster network. If a FSN is degraded and the user network fails over to a backup FSN, the degraded FSN can still be monitored and managed using the cluster network.
Table 1 FSN customer integration features

Feature: Additional user network VIFs
Description: Beyond the single user VIF that is required to support FSN failover, IBRIX supports adding additional user VIFs to an interface. The customer can use this VIF to associate additional IP address ranges with a FSN. The IP address ranges can then be used to assert control over the network traffic originating from that interface.
Figure 6 Two user VIFs and associated failover pairs

When this configuration is operating normally, the FSNs are set up as follows.

FSN 1 has four active interfaces and two standby interfaces:
• bond0 is the cluster network interface.
• bond0:0 is a cluster network VIF for the active Fusion Manager.
• bond0:2 and bond0:4 are the two active User VIFs for file serving from FSN 1.
• bond0:3 and bond0:5 are not active, but are provisioned for failover of file serving duties from FSN 2.
From the point of view of the IBRIX platform, this duplicates many of the capabilities provided by multiple user VIFs, and VLANs can be used in much the same way as user VIFs. From the network administrator's point of view, however, VLANs can provide a more flexible approach to managing the network, because VLAN support is often already built into the customer's intermediate networking infrastructure.
When operating normally, FSN 1 has four active interfaces and two standby interfaces:

Interfaces    Network    Active   Function                                           VLAN                 VIF
bond0.20      Cluster    Yes      Cluster Network Interface for FSN 1                Yes (VLAN Tag 20)    No
bond0.20:0    Cluster    Yes      Fusion Manager                                     Yes (VLAN Tag 20)    Yes (Base: bond0.20)
bond0.30      User 1     Yes      File Serving on VLAN 30 from FSN 1                 Yes (VLAN Tag 30)    No
bond0.30:2    User 1     No       Failover for File Serving on VLAN 30 from FSN 2    Yes (VLAN Tag 30)    Yes (Base: bond0.30)
bond0.
Appropriate ifconfig output from FSN 1:

Appropriate ifconfig output from FSN 2:
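For reference, a VLAN-tagged subinterface such as bond0.30 corresponds to a standard Linux 802.1Q device layered on the bond. The following is a minimal sketch using the generic iproute2 commands; the VLAN tag and address are illustrative, and on IBRIX systems the supported way to create and manage these interfaces is through the IBRIX management tools rather than by hand.

[prompt ~]# ip link add link bond0 name bond0.30 type vlan id 30   # 802.1Q subinterface of bond0 carrying VLAN tag 30
[prompt ~]# ip addr add 10.30.30.11/24 dev bond0.30                # illustrative address on the VLAN 30 user subnet
[prompt ~]# ip link set bond0.30 up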
Link Aggregation Control Protocol (LACP) trunking

Link Aggregation Control Protocol is an IEEE standard that provides a method to control the bundling of several physical interfaces to form a single logical channel. LACP works by sending frames (LACPDUs) down all links that have the protocol enabled.
In addition to the Linux bond trunking implemented primarily for fault-protection, some IBRIX 9000 components also support the lower level LACP trunking mechanism to increase the maximum bandwidth available to a network connection. In 9720/9730 systems, LACP support is enabled in the Virtual Connect modules to allow an increase of the maximum bandwidth available between the c7000 enclosure and the customer network.
2 IBRIX 9730 platform networking

The IBRIX 9730 platform uses a c7000 enclosure with the following components:
• A rack-mounted enclosure frame with a midplane for routing power and signals between the bays in the enclosure.
• Bays for two Onboard Administrator modules.
• Bays for two Virtual Connect Interconnect modules.
• Bays for four 6G SAS switch Interconnect modules.
• Bays for up to eight pairs of server blades.
Interconnect Module: Virtual Connect (VC)

The Virtual Connect modules are responsible for configuring and routing all of the network traffic between the enclosure’s servers and the external customer network. The c7000 enclosure dedicates interconnect bays 1 and 2 to the Virtual Connect modules. Two modules in a master-slave relationship are used to achieve path redundancy between the server blades and the external network.
Unified network

The unified network is the recommended default configuration for IBRIX 9730 networking. This configuration combines the cluster, user, and management networks onto a single IP network. Subnets segregate the management network from the cluster and user networks, while still allowing access for remote management. Modern high-speed networking hardware makes this combination possible without unduly affecting performance.
Logical description

Figure 8 IBRIX 9730 unified network — logical view

Logically, the unified configuration is a single, large IP network with a separate subnet for the 9730 management components. In Figure 8, the IP addresses show the relationship between components and subnets. (The subnet mask and IP address assignments shown here are for illustration purposes only; customer address allocation schemes can be done differently.)
When correctly configured, the following should work:
• FSNs should be able to ICMP ping all of the other network attached components, both on the user/cluster subnet and the management subnet.
• File Clients should be able to ICMP ping the FSNs and the components on the management subnet.
• The protocols listed in Table 3 (page 25) must work between the components and should not be blocked by intervening firewalls or routers.
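A minimal connectivity check for the first two items can be run with ICMP ping from an FSN and from a file client. The addresses below are illustrative placeholders and should be replaced with the addresses assigned on the actual installation.

[prompt ~]# ping -c 3 10.30.214.50     # from an FSN: a file client on the user/cluster subnet (illustrative address)
[prompt ~]# ping -c 3 172.16.1.10      # from an FSN: a component on the management subnet (illustrative address)
[client ~]$ ping -c 3 10.30.214.202    # from a file client: an FSN user/cluster address (illustrative address)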
Physical description

Figure 9 IBRIX 9730 unified network — physical description

The physical connection from the enclosure to the customer network is routed through either the VC modules or the OA modules. Network traffic between the enclosure and the customer network must traverse these modules to exit the enclosure. All other physical connections are routed internally via the enclosure’s midplane.
Example of packet traversals

This section uses sample packet traversals through the physical connections to illustrate the flow of network traffic.

Packet traversing from FSN to external user network:
1. The packet is created in the application layer of the FSN and is destined for a file client on the user network.
2. The O/S determines by IP address that the user network can be reached through the bond0 interface and queues the packet for that interface.
3. The bond driver determines which of its two underlying physical interfaces are up and functioning correctly.
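The routing decision in step 2 can be observed directly on an FSN with the iproute2 tools; the client address below is illustrative, and the reply shows which interface and source address the kernel would use.

[prompt ~]# ip route get 10.30.214.50
10.30.214.50 dev bond0  src 10.30.214.202
    cache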
FSN physical hardware mapping

Table 5 shows the mapping of the logical networks to the server blade and VC interconnect hardware.
Table 6 IBRIX 9730 unified network — cabling summary

Connection                 Origin                               Destination
Virtual Connect Module 1   Interconnect bay 1 – Connector X6    Customer Edge Switch – User/Cluster subnet
Virtual Connect Module 2   Interconnect bay 2 – Connector X6    Customer Edge Switch – User/Cluster subnet
Onboard Administrator 1    OA1 – RJ-45 (Labeled: iLO)           Customer Edge Switch – Management subnet
Onboard Administrator 2    OA2 – RJ-45 (Labeled: iLO)           Customer Edge Switch – Management subnet

Increasing external bandwidth
Verifying the network configuration

Run the following commands on an FSN to verify and troubleshoot the unified network configuration. Sample output for a correctly configured enclosure is provided.

Verify Virtual Connect configuration. Use the vc_config_check.py script to verify the enclosure VC modules are configured as expected.

Verify FSN network interface devices.
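The interface checks referred to above are ordinary Linux commands run on the FSN. A minimal sketch follows; interface names are illustrative, and the exact arguments accepted by vc_config_check.py can vary by release, so treat these invocations as examples rather than authoritative syntax.

[prompt ~]# ifconfig -a                          # confirm bond0 and its VIFs (for example bond0:0, bond0:2) are present
[prompt ~]# cat /proc/net/bonding/bond0          # confirm the bond mode and that both slave interfaces report MII Status: up
[prompt ~]# ethtool eth2 | grep -E "Speed|Link"  # confirm the link speed and state of each physical slave interface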
Dedicated management network

The dedicated management network configuration consists of two distinct networks. One network carries the cluster and user network traffic. A separate network is dedicated to the management network traffic. The dedicated management network is equivalent to the standard IBRIX 9720 configuration, which shipped with dedicated switches for the management network.
Logical description

Figure 12 IBRIX 9730 dedicated management network — logical view

The dedicated management network configuration is logically two separate IP networks, with one network for the user/cluster traffic and a separate network for the management traffic. In this configuration, the file serving nodes are equipped with two separate bonds: bond0 provides a direct connection to the management network, and bond1 provides the connection to the user/cluster network.
When correctly configured, the following should work:
• FSNs should be able to ICMP ping all other network attached components, both on the user/cluster subnet and the management subnet.
• File Clients should be able to ICMP ping the FSNs.
• The protocols listed in Table 3 (page 25) must work between the components, and should not be blocked by intervening firewalls or routers.

IP address usage

In this configuration, the IBRIX 9730 requires the IP addresses listed in Table 7.
Physical description

Figure 13 IBRIX 9730 dedicated management network — physical view

The physical connection from the enclosure to the customer network is routed through either the VC modules or the OA modules. Network traffic between the enclosure and the customer network must traverse these modules to exit the enclosure. All other physical connections are routed internally via the enclosure’s midplane.
The VC modules are cross linked through connectors X7 and X8 to provide redundant paths from the enclosure midplane to the external network. The connection of X7 and X8 is internally implemented on the enclosure midplane. External cables should not be plugged into VC connections X7 and X8.

Example of packet traversals

This section provides sample packet traversals through the physical connections to illustrate the flow of network traffic.

Packet traversing from FSN to external user network:
1. The packet is created in the application layer of the FSN and is destined for a file client on the user network.
6. The OA determines that the packet needs to be routed to the enclosure's SAS interconnect. It forwards the packet via the enclosure midplane to the management interface on the target SAS interconnect.
7. The embedded system on the SAS interconnect module receives and processes the packet.

FSN physical hardware mapping

Table 8 shows the mapping of the logical networks to the server blade and VC interconnect hardware.
The six patch cables shown in Figure 14 are the minimum cabling required to attach an IBRIX 9730 enclosure in the dedicated management network configuration. For maximum redundancy, each patch cable ideally connects to a physically separate edge switch in the customer network. Table 9 summarizes the necessary connections.
Verifying the network configuration

The following commands can be run on an FSN to verify and troubleshoot the network configuration. Sample output for a correctly configured enclosure is provided.

Verify Virtual Connect configuration. Use the vc_config_check.py script to verify that the enclosure VC modules are configured as expected:

[prompt ~]# vc_config_check.py
eth3      Link encap:Ethernet  HWaddr 68:B5:99:B2:FC:D5
          UP BROADCAST RUNNING SLAVE MULTICAST  MTU:1500  Metric:1
          RX packets:23177897 errors:0 dropped:0 overruns:0 frame:0
          TX packets:1762308 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:2410457366 (2.2 GiB)  TX bytes:105730002 (100.8 MiB)
          Memory:f3ea0000-f3ec0000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
Logical description

Figure 16 IBRIX 9730 additional physical user networks — logical view

Figure 16 illustrates a configuration with two additional user networks. The configuration is similar to the dedicated management network, but additional bonded interfaces (bond2 and bond3) are added to support the connections to two additional physical networks. The additional networks are used for the attachment of file clients.
When correctly configured, the following should work:
• FSNs should be able to ICMP ping all of the other network attached components, both on the user/cluster subnet and the management subnet.
• File Clients should be able to ICMP ping the FSNs.
• The protocols listed in Table 3 (page 25) must work between the components, and should not be blocked by intervening firewalls or routers.
Physical description

Figure 17 IBRIX 9730 one additional physical user network — physical view
Figure 18 IBRIX 9730 two additional physical user networks — physical view

Adding a physical user network results in an additional bonded interface on the FSN and a corresponding redundant set of connections to switches in the customer network. For one additional network, the FSN gains a bond2 interface and two external connections. For two additional networks, the FSN gains bond2 and bond3 interfaces and four external connections.
To make the diagrams clearer, Figures 17 and 18 do not show the internal management and VC cross link connections. In the physical implementation, those internal connections are present. The VC modules are cross linked through connectors X7 and X8 to provide redundant paths from the enclosure midplane to the external network. The connection of X7 and X8 is internally implemented on the enclosure midplane. External cables should not be plugged into VC connections X7 and X8.
Table 13 IBRIX 9730 — mapping for two additional physical user networks (continued)

Network   FSN Interface   Allocated Bandwidth   FSN Physical Interface   VC Module            VC External Connection
User2     Bond2           3.5 Gb                eth 4                    Interconnect Bay 1   X4 X3
                                                eth 5                    Interconnect Bay 2   X4 X3
User3     Bond3           3.
Table 14 IBRIX 9730 one additional physical user network — cabling summary

Connection                                          Origin                               Destination
Virtual Connect Module 1, Management Connection     Interconnect bay 1 – Connector X6    Customer Edge Switch – Management network
Virtual Connect Module 1, User/Cluster Connection   Interconnect bay 1 – Connector X3    Customer Edge Switch – User/Cluster network
Virtual Connect Module 2, Management Connection     Interconnect bay 2 – Connector X6    Customer Edge Switch – Management network
Virtual Connect Mod
Table 15 IBRIX 9730 two additional physical user networks — cabling summary

Connection                                          Origin                               Destination
Virtual Connect Module 1, Management Connection     Interconnect bay 1 – Connector X6    Customer Edge Switch – Management network
Virtual Connect Module 1, User/Cluster Connection   Interconnect bay 1 – Connector X2    Customer Edge Switch – User/Cluster network
Virtual Connect Module 2, Management Connection     Interconnect bay 2 – Connector X6    Customer Edge Switch – Management network
Virtual Connect Mo
Figure 22 IBRIX 9730 two additional physical user networks – LACP trunk cabling
3 IBRIX 93xx platform networking

The IBRIX 93xx platform uses discrete servers and storage components to form a storage solution. A minimal 93xx installation has the following components:
• Two rack-mounted servers with expansion cards supporting SAS and high-speed Ethernet connections.
• One MSA storage enclosure.
This section includes the following information:
• Motivation. A discussion of the motivations for choosing the topology.
• Logical description. The topology at the IP addressing layer, showing the network attached components that must be able to interchange packets. The preferred segregation of network traffic to match the expected usage of each component is also discussed. Sample IP addresses show the relationships between components.
• Physical description.
Logical description

Figure 23 IBRIX 93xx unified network — logical view

Logically, the unified configuration is a single, large IP network with a separate subnet for the 93xx management components. In Figure 23, the example IP addresses were chosen to illustrate the relationship between components and subnets. The subnet mask and IP address assignments shown here are for illustration purposes only. Customer address allocation schemes can be done differently.
Table 16 IBRIX 93xx unified network — IP addresses (continued)

Component   Subnet   Minimum number of addresses    Maximum number of addresses
User VIF    user     2                              Variable. At least 1 per FSN (2 included in total)
                     9 for minimum configuration    9 for maximum configuration

Customer use of VLANs or multiple user network VIFs can require additional IP addresses beyond those specified in this table.
Packet traversing from FSN to external user network:
1. The packet is created in the application layer of the FSN and is destined for a file client on the user network.
2. The O/S determines by IP address that the user network can be reached through the bond0 interface and queues the packet for that interface.
3. The bond driver determines which of its two underlying physical interfaces are up and functioning correctly.
Table 18 IBRIX 93xx unified configuration – 2x 10 Gb NIC physical mapping

Network                     FSN Interface   Allocated Bandwidth   FSN Physical Interface   Connection Type
User, cluster, management   Bond0           10 Gb                 eth 4                    Patch cable to customer edge switch
                                                                  eth 5                    Patch cable to customer edge switch
                                                                  eth 6                    Patch cable to customer edge switch
                                                                  eth 7                    Patch cable to customer edge switch

Physical cabling

Figure 25 IBRIX 93xx unified network — server cabling

The minimum cabling required for attachment of a single IBRIX 93x
Figure 26 IBRIX 93xx unified network — MSA enclosure cabling

The minimum cabling required for attachment of a single MSA enclosure to the management network consists of the two patch cables shown in Figure 26. Each SAS controller has its own management port that must be connected to the customer provided edge switch for the management subnet.

HP IBRIX 93xx 6.2 QR-DVD.iso installation network defaults

By default, the 6.2 QR-DVD.
Table 19 IBRIX 93xx unified network configuration — cabling summary (continued)

Connection                    Origin                                                                      Destination
Server 2, Datapath 2          Rear of Server 2, Expansion slot, eth 5 SFP connector                       Customer Edge Switch – User/Cluster subnet
Server 2, iLO                 Rear of Server 2, Built-in iLO RJ-45 connector                              Customer Edge Switch – Management subnet
MSA enclosure, Controller A   Rear of MSA enclosure, Upper controller slot, Integrated RJ-45 connector    Customer Edge Switch – Management subnet
MSA enclosure, Controller B
4 Expanding an existing cluster

When adding a new installation to an existing cluster, HP recommends that the new installation conform to the network topology of the existing cluster. Since earlier IBRIX platforms were often shipped with dedicated ProCurve switches for connecting the management components, this implies that the new system should implement the dedicated management network topology and should use the existing ProCurve switches for its management network.
Figure 28 IBRIX 9720 expansion with 9730 — physical network
Figure 29 IBRIX 9720 expansion with 9730 — cabling
5 Support and other resources

Contacting HP

For worldwide technical support information, see the HP support website: http://www.hp.
6 Documentation feedback

HP is committed to providing documentation that meets your needs. To help us improve the documentation, send any errors, suggestions, or comments to Documentation Feedback (docsfeedback@hp.com). Include the document title and part number, version number, or the URL when submitting your feedback.
A BOND modes

Table 20 BOND mode descriptions

Mode   Mode Name       Mode Description
1      active-backup   Active-backup policy: Only one slave in the bond is active. A different slave becomes active if, and only if, the active slave fails. The bond's MAC address is externally visible on only one port (network adapter) to avoid confusing the switch. In bonding version 2.6.2 or later, when a failover occurs in active-backup mode, bonding issues one or more gratuitous ARPs on the newly active slave.
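As an illustration of how mode 1 is typically requested on a Red Hat-based node, the bonding options can be supplied through the bond's ifcfg file. This is a generic Linux sketch with illustrative values; on IBRIX systems the bond configuration is normally laid down by the installer rather than edited by hand.

# /etc/sysconfig/network-scripts/ifcfg-bond0 (illustrative sketch of an active-backup bond)
DEVICE=bond0
ONBOOT=yes
BOOTPROTO=none
IPADDR=10.30.214.202
NETMASK=255.255.255.0
BONDING_OPTS="mode=active-backup miimon=100"   # mode 1 (active-backup) with 100 ms link monitoring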
B IBRIX 93xx 10 GbE bonding modes and switch interconnection

This appendix discusses supported bond modes versus the connection topology of the customer edge switch. The connections described here all originate from a single file serving node. Table 21 describes the switch topologies.
C Install and the default Virtual Connect configuration

Before an initial install is performed on a c7000 enclosure, its Virtual Connect modules are in the factory default configuration. Table 22 lists the default configuration and its implications for the installer.