HP StorageWorks X9720 Network Storage System Administrator Guide This guide describes tasks related to cluster configuration and monitoring, system upgrade and recovery, hardware component replacement, and troubleshooting. It does not document X9000 file system features or standard Linux administrative tools and commands. For information about configuring and using X9000 Software file system features, see the HP StorageWorks X9000 File Serving Software File System User Guide.
Legal and notice information © Copyright 2009-2010 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
1 Product description
HP X9720 Network Storage System features
System components
HP X9000 Software features
Agile management consoles
Agile management console modes
Agile management consoles and failover
Viewing information about management consoles
Cluster high availability
Adding an X9000 client to a hostgroup
Adding a domain rule to a hostgroup
Viewing hostgroups
Deleting hostgroups
12 Upgrading the X9000 Software
Automatic upgrades
Manual upgrades
Standard upgrade for clusters with a dedicated Management Server machine or blade
Standard online upgrade
exds_netperf
POST error messages
LUN layout
X9720 monitoring
Replacing the X9700cx fan
Replacing a SAS cable
18 Recovering the X9720 Network Storage System
Starting the recovery
Configuring a file serving node
AW550A—X9700 Blade Server
AW551A—X9700 Capacity Block (X9700c and X9700cx)
C Warnings and precautions
Electrostatic discharge information
Grounding methods
Spanish notice
Swedish notice
Glossary
Index
1 Product description HP StorageWorks X9720 Network Storage System is a scalable, network-attached storage (NAS) product. The system combines HP X9000 File Serving Software with HP server and storage hardware to create a cluster of file serving nodes.
IMPORTANT: All software that is included with the X9720 Network Storage System is for the sole purpose of operating the system. Do not add, remove, or change any software unless instructed to do so by HP-authorized personnel. For more information about system components and cabling, see Appendix A.
2 Getting started This chapter describes how to log into the system, how to boot the system and individual server blades, how to change passwords, and how to back up the management console configuration. It also describes the management interfaces provided with X9000 Software.
• NDMP backups. These cluster features are described later in this guide. File systems. Set up the following features as needed: • Additional file systems. Optionally, configure data tiering on the file systems to move files to specific tiers based on file attributes. • NFS, CIFS, FTP, or HTTP. Configure the methods you will use to access file system data. • Quotas. Configure user, group, and directory tree quotas as needed. • Remote replication.
Using the serial link on the Onboard Administrator If you are connected to a terminal server, you can log in through the serial link on the Onboard Administrator. Booting the system and individual server blades Before booting the system, ensure that all of the system components other than the server blades—the capacity blocks and so on—are turned on. By default, server blades boot whenever power is applied to the X9720 Network Storage System performance chassis (c-Class Blade enclosure).
http://<management_console_VIF_IP>:80/fusion
If you are using HTTPS to access the GUI, navigate to the following location, specifying port 443:
https://<management_console_VIF_IP>:443/fusion
In these URLs, <management_console_VIF_IP> is the IP address of the management console user VIF. The GUI prompts for your user name and password. The default administrative user is ibrix. Enter the password that was assigned to this user when the system was installed. (You can change the password using the Linux passwd command.)
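For example, to change the ibrix user's password from a shell on the management console node (passwd is a standard Linux command, not specific to X9000 Software):
passwd ibrix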
The GUI dashboard enables you to monitor the entire cluster. There are three parts to the dashboard: System Status, Cluster Overview, and the Navigator.
System Status
The System Status section lists the number of cluster events that have occurred in the last 24 hours. There are three types of events:
• Alerts. Disruptive events that can result in loss of access to file system data. Examples are a segment that is unavailable or a server that cannot be accessed.
• Warnings.
Whether the specified file system services are currently running: One or more tasks are running. No tasks are running. Statistics Historical performance graphs for the following items: • Network I/O (MB/s) • Disk I/O (MB/s) • CPU usage (%) • Memory usage (%) On each graph, the X-axis represents time and the Y-axis represents performance.
NOTE: When you perform an operation on the GUI, a spinning finger is displayed until the operation is complete. However, if you use Windows Remote Desktop to access the management console, the spinning finger is not displayed. Customizing the GUI For most tables in the GUI, you can specify the columns that you want to display and the sort order of each column. When this feature is available, mousing over a column causes the label to change color and a pointer to appear.
Adding user accounts for GUI access X9000 Software supports administrative and user roles. When users log in under the administrative role, they can configure the cluster and initiate operations such as remote replication or snapshots. When users log in under the user role, they can view the cluster configuration and status, but cannot make configuration changes or initiate operations. The default administrative user name is ibrix. The default regular username is ibrixuser.
Using the Windows X9000 client GUI The Windows X9000 client GUI is the client interface to the management console. To open the GUI, double-click the desktop icon or select the IBRIX Client program from the Start menu on the client. The client program contains tabs organized by function. NOTE: The Windows X9000 client application can be started only by users with Administrative privileges. • Status.
Getting started
3 Configuring the firewall IMPORTANT: To avoid unintended consequences, HP recommends that you perform the procedures in this chapter during scheduled maintenance times. Firewall concepts The X9720 Network Storage System uses iptables to implement a firewall on each server. The iptables rules allow the servers to communicate with each other and to provide appropriate NAS services to other systems. Unless needed for a specific purpose, the general policy is to disallow traffic on all ports.
Port        Description
Remote management
22/tcp      Allows ssh access to the servers in the system.
9022/tcp    Allows ssh access to the Onboard Administrator (OA) from a remote system.
12865/tcp   Allows the exds_netperf tool running on a client to access the X9720 Network Storage System.
Port        Description
The following section describes services that normally operate across the management network (bond0). However, if the management network is down, these services use the site network. The firewall must not prevent X9720 Network Storage System servers in the same system from communicating via these ports (but can block other systems).
1. Determine the rule number of the http rule by running the iptables list command:
# iptables -L MXSO-External-Filter
Counting down the rule set in the resulting output, the http rule is rule number 3 in this example.
2. Having identified the rule to be updated, replace it with a rule that limits requests to those with a source address on the 16.123.8 subnet, and then save the rules:
# iptables -R MXSO-External-Filter 3 -p tcp -m tcp -s 16.123.8.0/24 --dport 80 -j ACCEPT
# service iptables save
3.
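As an alternative way to find the rule number in step 1, iptables can print rule numbers directly (a standard iptables option, using the chain name from the procedure above):
# iptables -L MXSO-External-Filter --line-numbers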
4 Configuring virtual interfaces for client access X9000 Software uses a cluster network interface to carry management console traffic and traffic between file serving nodes. This network is configured as bond0 when the cluster is installed. For clusters with an agile management console configuration, a virtual interface is also created for the cluster network interface to provide failover support for the console.
3. Assign an IP address to the bond1:1 VIFs on each node:
# ibrix_nic -c -n bond1:1 -h node1 -I 16.123.200.201 -M 255.255.255.0
# ibrix_nic -c -n bond1:1 -h node2 -I 16.123.200.202 -M 255.255.255.0
# ibrix_nic -c -n bond1:1 -h node3 -I 16.123.200.203 -M 255.255.255.0
# ibrix_nic -c -n bond1:1 -h node4 -I 16.123.200.204 -M 255.255.255.0
Configuring standby backup nodes
Assign standby backup nodes for the bond1:1 interface. The backup nodes should be configured in pairs.
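The exact command for assigning the standby pairs is not shown here. Assuming it mirrors the deletion form (ibrix_nic -b -U) described in "Deleting standbys," with -H naming the primary/standby pair, the assignments for the four example nodes might look like this:
# ibrix_nic -b -H node1/bond1:1,node2/bond1:1
# ibrix_nic -b -H node3/bond1:1,node4/bond1:1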
clients could connect to bond1 on either host, as these clients do not support or require NIC failover. (The following sample output shows only the relevant fields.)
Configuring virtual interfaces for client access
5 Configuring failover This chapter describes how to configure failover for agile management consoles, file serving nodes, network interfaces, and HBAs. Agile management consoles The management console maintains the cluster configuration and provides graphical and command-line user interfaces for managing and monitoring the cluster. Typically, one active management console and one passive management console are installed when the cluster is installed.
• Management console GUI. You will need to reconnect to the management console VIF after the failover. Failing over the management console manually To fail over the active management console manually, place the console into maintenance mode. Enter the following command on the node hosting the console: ibrix_fm -m maintenance The command takes effect immediately.
When automated failover is enabled, the management console listens for heartbeat messages that the file serving nodes broadcast at one-minute intervals. The management console automatically initiates failover when it fails to receive five consecutive heartbeats or, if HBA monitoring is enabled, when a heartbeat message indicates that a monitored HBA or pair of HBAs has failed.
• • • • The management console must have access to both the primary server and its standby. The same file system must be mounted on both the primary server and its standby. A server identified as a standby must be able to see all segments that might fail over to it. In a SAN environment, a primary server and its standby must use the same storage infrastructure to access a segment’s physical volumes (for example, a multiported RAID array).
APC power source. To identify an APC power source, use the following command: /bin/ibrix_powersrc -a -t {apc|apc_msp} -h POWERSRCNAME -n NUMSLOTS -I IPADDR For example, to identify an eight-port APC power source named ps1 at IP address 192.168.3.150: /bin/ibrix_powersrc -a -t apc -h ps1 -n 8 -I 192.168.3.150 For APC power sources, you must also associate file serving nodes to power source slots.
/bin/ibrix_powersrc -d -h POWERSRCLIST Turning automated failover on and off Automated failover is turned off by default. When automated failover is turned on, the management console starts monitoring heartbeat messages from file serving nodes. You can turn automated failover on and off for all file serving nodes or for selected nodes.
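For example, the cluster-wide forms used elsewhere in this guide are shown below; the per-node form with -h is an assumption based on the HOSTLIST option used by related commands:
/bin/ibrix_server -m
/bin/ibrix_server -m -U
/bin/ibrix_server -m -h node1.hp.com
The first command turns automated failover on for all file serving nodes, the second turns it off, and the third (assumed) form limits the change to the listed nodes.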
After failing back the node, determine whether the failback completed fully. If the failback is not complete, contact HP Support for assistance. NOTE: A failback might not succeed if the time period between the failover and the failback is too short, and the primary server has not fully recovered. HP recommends ensuring that both servers are up and running and then waiting 60 seconds before starting the failback. Use the ibrix_server -l command to verify that the primary server is up and running.
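For example, to verify the primary server and then fail back the node, using the same ibrix_server -f -U form that appears in the upgrade chapters (the hostname is a placeholder):
/bin/ibrix_server -l
/bin/ibrix_server -f -U -h node1.hp.com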
Identifying standbys To protect a network interface, you must identify a standby for it on each file serving node that connects to the interface. The following restrictions apply when identifying a standby network interface: • The standby network interface must be unconfigured and connected to the same switch (network) as the primary interface. • The file serving node that supports the standby network interface must have access to the file system that the clients on that interface will mount.
/bin/ibrix_nic -m -h MONHOST -D DESTHOST/IFNAME Deleting standbys To delete a standby for a network interface, use the following command: /bin/ibrix_nic -b -U HOSTNAME1/IFNAME1 For example, to delete the standby that was assigned to interface eth2 on file serving node s1.hp.com: /bin/ibrix_nic -b -U s1.hp.com/eth2 Setting up HBA monitoring You can configure High Availability to initiate automated failover upon detection of a failed HBA.
Identifying standby-paired HBA ports Identifying standby-paired HBA ports to the configuration database allows the management console to apply the following logic when they fail: • If one port in a pair fails, do nothing. Traffic will automatically switch to the surviving port, as configured by the vendor or the software. • If both ports in a pair fail, fail over the server’s segments to the standby server.
/bin/ibrix_hba -l [-h HOSTLIST]
The following table describes the fields in the output.
Field             Description
Host              Server on which the HBA is installed.
Node WWN          This HBA's WWNN.
Port WWN          This HBA's WWPN.
Port State        Operational state of the port.
Backup Port WWN   WWPN of the standby port for this port (standby-paired HBAs only).
Monitoring        Whether HBA monitoring is enabled for this port.
/bin/ibrix_haconfig -l -h xs01.hp.com,xs02.hp.com
Host         HA Configuration  Power Sources  Backup Servers  Auto Failover  Nics Monitored  Standby Nics  HBAs Monitored
xs01.hp.com  FAILED            PASSED         PASSED          PASSED         FAILED          PASSED        FAILED
xs02.hp.
6 Configuring cluster event notification Setting up email notification of cluster events You can set up event notifications by event type or for one or more specific events. To set up automatic email notification of cluster events, associate the events with email recipients and then configure email settings to initiate the notification process.
Turning email notifications on or off After configuration is complete, use the -m on option to turn on email notifications. To turn off email notifications, use the -m off option.
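A minimal sketch of the flow follows. The association command is assumed to mirror the SNMP form shown in the next chapter (-y EMAIL in place of -y SNMP), and the on/off switch is assumed to belong to the same ibrix_event command; the address is a placeholder:
/bin/ibrix_event -c -y EMAIL -e ALERT -m admin@example.com
/bin/ibrix_event -m on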
NOTE: Users of software versions earlier than 4.3 should be aware that the single ibrix_snmp command has been replaced by two commands, ibrix_snmpagent and ibrix_snmptrap. If you have scripts that include ibrix_snmp, be sure to edit them to include the correct commands. Whereas SNMPv2 security was enforced by use of community password strings, V3 introduces the USM and VACM. Discussion of these models is beyond the scope of this document. Refer to RFCs 3414 and 3415 at http://www.ietf.
The update command for SNMPv1 and v2 uses optional community names. By convention, the default READCOMMUNITY name used for read-only access and assigned to the agent is public. No default WRITECOMMUNITY name is set for read-write access (although the name private is often used). The following command updates a v2 agent with the write community name private, the agent's system name, and that system's physical location: ibrix_snmpagent -u -v 2 -w private -n agenthost.domain.
/bin/ibrix_event -c -y SNMP [-e ALERT|INFO|EVENTLIST] -m TRAPSINK
For example, to associate all Alert events and two Info events with a trapsink at IP address 192.168.2.32, enter:
/bin/ibrix_event -c -y SNMP -e ALERT,server.registered,filesystem.created -m 192.168.2.32
ibrix_snmpgroup -c -g GROUPNAME [-s {noAuthNoPriv|authNoPriv|authPriv}] [-r READVIEW] [-w WRITEVIEW] [-x CONTEXT_NAME] [-m {exact|prefix}] For example, to create the group group2 to require authorization, no encryption, and read access to the hp view, enter: ibrix_snmpgroup -c -g group2 -s authNoPriv -r hp The format to create a user and add that user to a group follows: ibrix_snmpuser -c -n USERNAME -g GROUPNAME [-j {MD5|SHA}] [-k AUTHORIZATION_PASSWORD] [-y {DES|AES}] [-z PRIVACY_PASSWORD] Authenticati
7 Configuring system backups Backing up the management console configuration The management console configuration is automatically backed up whenever the cluster configuration changes. The backup takes place on the node hosting the active management console (or on the Management Server, if a dedicated management console is configured). The backup file is stored at /tmp/fmbackup.zip on the machine where it was created.
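You can also force a backup manually and copy the archive off the cluster; for example, using the ibrix_fm -B form that appears in the upgrade chapters (the remote host and path are placeholders):
/bin/ibrix_fm -B
scp /tmp/fmbackup.zip backup-host:/safe/location/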
• Three-way NDMP operations between two X9300/X9320/X9720 systems Each file serving node functions as an NDMP Server and runs the NDMP Server daemon (ndmpd) process. When you start a backup or restore operation on the DMA, you can specify the node and tape device to be used for the operation.
NDMP process management Normally all NDMP actions are controlled from the DMA. However, if the DMA cannot resolve a problem or you suspect that the DMA may have incorrect information about the NDMP environment, take the following actions from the X9000 Software management console GUI or CLI: • Cancel one or more NDMP sessions on a file serving node. Canceling a session kills all spawned sessions processes and frees their resources if necessary. • Reset the NDMP server on one or more file serving nodes.
Viewing or rescanning tape and media changer devices To view the tape and media changer devices currently configured for backups, select Cluster Configuration from the Navigator, and then select NDMP Backup > Tape Devices. If you add a tape or media changer device to the SAN, click Rescan Device to update the list. If you remove a device and want to delete it from the list, you will need to reboot all of the servers to which the device is attached.
8 Creating hostgroups for X9000 clients A hostgroup is a named set of X9000 clients. Hostgroups provide a convenient way to centrally manage clients using the management console. You can put different sets of clients into hostgroups and then perform the following operations on all members of the group: • • • • • Create and delete mountpoints Mount file systems Prefer a network interface Tune host parameters Set allocation policies Hostgroups are optional.
hostgroups. To do this, mount ifs1 on the clients hostgroup, ifs2 on hostgroup A, ifs3 on hostgroup C, and ifs4 on hostgroup D, in any order. Then, set Tuning 1 on the clients hostgroup and Tuning 2 on hostgroup B. The end result is that all clients in hostgroup B will mount ifs1 and implement Tuning 2. The clients in hostgroup A will mount ifs2 and implement Tuning 1. The clients in hostgroups C and D respectively, will mount ifs3 and ifs4 and implement Tuning 1.
IP address that corresponds to a client network. Adding a domain rule to a hostgroup restricts its members to X9000 clients that are on the specified subnet. You can add a domain rule at any time. To add a domain rule to a hostgroup, use the ibrix_hostgroup command as follows: /bin/ibrix_hostgroup -a -g GROUPNAME -D DOMAIN For example, to add the domain rule 192.168 to the finance group: /bin/ibrix_hostgroup -a -g finance -D 192.168
Creating hostgroups for X9000 clients
9 Monitoring cluster operations Monitoring the X9720 Network Storage System status The X9720 storage monitoring function gathers X9720 system status information and generates a monitoring report. The X9000 management console displays status information on the dashboard. This section describes how to use the CLI to view this information. Monitoring intervals The monitoring interval is set by default to 15 minutes (900 seconds).
node1  Up, HBAsDown  0  0.00  0.00  off
node2  Up, HBAsDown  0  0.00  0.00  off
File serving nodes can be in one of three operational states: Normal, Alert, or Error. These states are further broken down into categories that are mostly related to the failover status of the node. The following table describes the states.
State    Description
Normal   Up: Operational.
         Up-Alert: Server has encountered a condition that has been logged.
• View events by type: /bin/ibrix_event -q [-e ALERT|WARN|INFO]
• View generated events on a last-in, first-out basis: /bin/ibrix_event -l
• View a designated number of events. The command displays the 100 most recent messages by default. Use the -n EVENTS_COUNT option to increase or decrease the number of events displayed.
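For example, to list only Alert events and then display the 50 most recent events (appending -n to the listing command is an assumption based on the option described above):
/bin/ibrix_event -q -e ALERT
/bin/ibrix_event -l -n 50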
• Failed. One or more tested hosts failed a health check. The health status of standby servers is not included when this result is calculated. • Warning. A suboptimal condition that might require your attention was found on one or more tested hosts or standby servers.
=============== Overall Result ==============
Result  Type    State         Network       Thread  Protocol
------  ------  ------------  ------------  ------  --------
PASSED  Server  Up, HBAsDown  99.126.39.72  16      true
CPU Information
===============
Cpu(System,User,Util,Nice)
--------------------------
0, 1, 1, 0
Memory Information
==================
Mem Total  Mem Free
---------  --------
1944532    1841548
Module  Up time    Last Update
------  ---------  -----------
Loaded  3267210.
Check Description
-----------------
lab15-61 engine uuid matches on Iad and Fusion Manager
lab15-61 IP address matches on Iad and Fusion Manager
lab15-61 network protocol matches on Iad and Fusion Manager
lab15-61 engine connection state on Iad is up
lab15-62 engine uuid matches on Iad and Fusion Manager
lab15-62 IP address matches on Iad and Fusion Manager
lab15-62 network protocol matches on Iad and Fusion Manager
lab15-62 e
To view the statistics from the CLI, use the following command:
/bin/ibrix_stats -l [-s] [-c] [-m] [-i] [-n] [-f] [-h HOSTLIST]
Use the options to view only certain statistics or to view statistics for specific file serving nodes:
-s  Summary statistics
-c  CPU statistics
-m  Memory statistics
-i  I/O statistics
-n  Network statistics
-f  NFS statistics
-h  The file serving nodes to be included in the report
Sample output follows:
---------Summary-----------
HOST Status CPU Disk(MB/s) Net(MB/s) l
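For example, to limit the report to summary and I/O statistics for two specific nodes (the node names are placeholders):
/bin/ibrix_stats -l -s -i -h node1,node2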
Monitoring cluster operations
10 Maintaining the system Shutting down the system To shut down the system completely, first shut down the X9000 software, and then power off the X9720 hardware. Shutting down the X9000 Software Use the following procedure to shut down the X9000 Software. Unless noted otherwise, run the commands from the dedicated Management Console or from the node hosting the active agile management console. 1. Disable HA for all file serving nodes: ibrix_server -m -U 2.
Powering off the X9720 system hardware After shutting down the X9000 Software, power off the X9720 hardware as follows: 1. Power off the 9100c controllers. 2. Power off the 9200cx disk capacity block(s). 3. Power off the file serving nodes. The cluster is now completely shut down. Starting up the system To start an X9720 system, first power on the hardware components, and then start the X9000 Software. Powering on the X9720 system hardware To power on the X9720 hardware, complete the following steps: 1.
Powering file serving nodes on or off When file serving nodes are connected to properly configured power sources, the nodes can be powered on or off or can be reset remotely. To prevent interruption of service, set up standbys for the nodes (see “Identifying standbys for file serving nodes” on page 33), and then manually fail them over before powering them off (see “Manually failing over a file serving node” on page 36). Remotely powering off a file serving node does not trigger failover.
CAUTION: Changing host tuning settings will alter file system performance. Contact HP Support before changing host tuning settings. Use the ibrix_host_tune command to list or change host tuning settings: • To list default values and valid ranges for all permitted host tunings: /bin/ibrix_host_tune -L • To tune host parameters on nodes or hostgroups: /bin/ibrix_host_tune -S {-h HOSTLIST|-g GROUPLIST} -o OPTIONLIST Contact HP Support to obtain the values for OPTIONLIST.
See the ibrix_lwhost command description in the HP StorageWorks X9000 File Serving Software CLI Reference Guide for other available options. Windows clients. Click the Tune Host tab on the Windows X9000 client GUI. Tunable parameters include the NIC to prefer (the default is the cluster interface), the communications protocol (UDP or TCP), and the number of server threads to use. See the online help for the client if necessary.
To evacuate a segment, complete the following steps: 1. Identify the segment residing on the physical volume to be removed. Select Storage from the Navigator on the management console GUI. Note the file system and segment number on the affected physical volume. In the following example, physical volume d1 is being retired. Segment 1 from file system ifs1 uses that physical volume. 2. Locate other segments on the file system that can accommodate the data being evacuated from the affected segment.
ibrix_replicate -f FSNAME -b EVACUATED_SEGNUM If you evacuated the root segment (segment 1 by default), include the -F option in the command. The segment number associated with the storage is not reused. 6. If quotas were disabled on the file system, unmount the file system and then re-enable quotas using the following command: ibrix_fs -q -E -f FSNAME Then remount the file system.
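Putting these steps together for the earlier example (file system ifs1, evacuating root segment 1), the sequence might look like the following sketch; the ibrix_umount and ibrix_mount forms and the mountpoint are assumptions, as they are not shown in this procedure:
ibrix_replicate -f ifs1 -b 1 -F
ibrix_umount -f ifs1
ibrix_fs -q -E -f ifs1
ibrix_mount -f ifs1 -m /mnt/ifs1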
Link aggregation and virtual interfaces When creating a user network interface, you can use link aggregation to combine physical resources into a single VIF. VIFs allow you to provide many named paths within the larger physical resource, each of which can be managed and routed independently, as shown in the following diagram. See the network interface vendor documentation for any rules or restrictions required for link aggregation.
For example, to set netmask 255.255.0.0 and broadcast address 10.0.0.4 for interface eth3 on file serving node s4.hp.com: /bin/ibrix_nic -c -n eth3 -h s4.hp.com -M 255.255.0.0 -B 10.0.0.4 Preferring network interfaces After creating a user network interface for file serving nodes or X9000 clients, you will need to prefer the interface for those nodes and clients.
To unprefer a network interface for a hostgroup, use the following command: /bin/ibrix_client -n -g HOSTGROUP -A DESTHOST Making network changes This section describes how to change IP addresses, change the cluster interface, manage routing table entries, and delete a network interface.
• X9000 clients must have network connectivity to the file serving nodes that manage their data and to the standbys for those servers. This traffic can use the cluster network interface or a user network interface.
/bin/ibrix_nic -l -h HOSTLIST
The following table describes the fields in the output.
Field        Description
BACKUP HOST  File serving node for the standby network interface.
BACKUP-IF    Standby network interface.
HOST         File serving node. An asterisk (*) denotes the management console.
IFNAME       Network interface on this file serving node.
IP_ADDRESS   IP address of this NIC.
LINKMON      Whether monitoring is on for this NIC.
MAC_ADDR     MAC address of this NIC.
11 Migrating to an agile management console configuration The agile management console configuration provides one active management console and one passive management console installed on different file serving nodes in the cluster. The migration procedure configures the current Management Server blade as a host for an agile management console and installs another instance of the agile management console on a file serving node.
• If you are using X9000 clients over the user bond1 network, edit the /etc/sysconfig/network-scripts/ifcfg-bond1 file. Change the IP address to another unused, reserved IP address. Then run one of the following commands:
/etc/init.d/network restart
service network restart
If you are not at the local terminal, you might have to reconnect using the new IP address.
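A minimal sketch of what the edited ifcfg-bond1 file might contain follows; all values here are illustrative and not taken from this guide:
DEVICE=bond1
ONBOOT=yes
BOOTPROTO=static
IPADDR=172.16.4.100
NETMASK=255.255.255.0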
[root@x109s1 ~]# ibrix_fm -i FusionServer: x109s1 (active, quorum is running) ================================================ Command succeeded! 8. Verify that only one management console exists in this cluster: ibrix_fm -f For example: [root@x109s1 ~]# ibrix_fm -f NAME IP ADDRESS ------ ---------X109s1 172.16.3.100 Command succeeded! 9. To provide high availability for the management console, install a passive agile management console on an existing file serving node.
# Internal Admin Addresses
172.16.3.100 x109s1
172.16.3.100 x109s1-adm.internal
172.16.3.2 x109s2
172.16.3.2 x109s2-adm.internal
13. Update all nodes in the cluster with the newly modified /etc/hosts file. In the following command, X is the number of nodes in the cluster.
for i in `seq 1 X` ; do scp /etc/hosts 172.16.3.
12 Upgrading the X9000 Software This chapter describes how to upgrade to the latest X9000 File Serving Software release. The management console and all file serving nodes must be upgraded to the new release at the same time. X9000 Clients are supported for one version beyond their release. For example, an X9000 5.3.2 client can run with a 5.4 X9000 server, but not with a 5.5 X9000 server. IMPORTANT: Do not start new remote replication jobs while a cluster upgrade is in progress.
2. On the current active management console, move the /ibrix directory used in the previous release installation to ibrix.old. For example, if you expanded the tarball in /root during the previous X9000 installation on this node, the installer is in /root/ibrix. 3. On the current active management console, expand the distribution tarball or mount the distribution DVD in a directory of your choice. Expanding the tarball creates a subdirectory named ibrix that contains the installer program.
file system access to continue. This procedure cannot be used for major upgrades, but is appropriate for minor and maintenance upgrades. • Offline upgrades. This procedure requires that you first unmount file systems and stop services. (Each file serving node may need to be rebooted if NFS or CIFS causes the unmount operation to fail.) You can then perform the upgrade. Clients will experience a short interruption to file system access while each file serving node is upgraded.
Upgrading file serving nodes After the management console has been upgraded, complete the following steps on each file serving node: 1. From the management console, manually fail over the file serving node: /bin/ibrix_server -f -p -h HOSTNAME The node reboots automatically. 2. Move the /ibrix directory used in the previous release installation to ibrix.old.
3. Upgrade X9000 Clients: • For Linux clients, see Upgrading Linux X9000 clients, page 94. • For Windows clients, see Upgrading Windows X9000 clients, page 95. 4. Verify that all version indicators match for file serving nodes and X9000 Clients. Run the following command from the management console: /bin/ibrix_version –l If there is a version mismatch, run the /ibrix/ibrixupgrade -f script again on the affected node, and then recheck the versions.
2. Move the /ibrix directory used in the previous release installation to ibrix.old. For example, if you expanded the tarball in /root during the previous X9000 installation on this node, the installer is in /root/ibrix. 3. Expand the distribution tarball or mount the distribution DVD in a directory of your choice. Expanding the tarball creates a subdirectory named ibrix that contains the installer program. For example, if you expand the tarball in /root, the installer is in /root/ibrix.
Completing the upgrade 1. Remount all file systems: /bin/ibrix_mount -f -m 2. From the management console, turn automated failover back on: /bin/ibrix_server -m 3. Confirm that automated failover is enabled: /bin/ibrix_server -l In the output, HA displays on. 4. From the management console, perform a manual backup of the upgraded configuration: /bin/ibrix_fm -B 5.
Upgrading the file serving nodes hosting the management console Complete the following steps: 1. On the node hosting the active management console, force a backup of the management console configuration: /bin/ibrix_fm -B The output is stored at /usr/local/ibrix/tmp/fmbackup.zip. Be sure to save this file in a location outside of the cluster. 2. On the active management console node, disable automated failover on all file serving nodes: /bin/ibrix_server -m -U 3.
11. On the node with the active agile management console, expand the distribution tarball or mount the distribution DVD in a directory of your choice. Expanding the tarball creates a subdirectory named ibrix that contains the installer program. For example, if you expand the tarball in /root, the installer is in /root/ibrix. 12. Change to the installer directory if necessary and run the upgrade: .
Also run the following command, which should report that the console is passive: /bin/ibrix_fm -i 22. Check /usr/local/ibrix/log/fusionserver.log for errors. 23. If the upgrade was successful, fail back the node. Run the following command on the node with the active agile management console: /bin/ibrix_server -f -U -h HOSTNAME 24.
ipfs1 102592 0 (unused) If either grep command returns empty, contact HP Support. 7. From the management console, verify that the new version of X9000 Software FS/IAS has been installed on the file serving node: /bin/ibrix_version -l –S 8. If the upgrade was successful, failback the file serving node: /bin/ibrix_server -f -U -h HOSTNAME 9. Repeat steps 1 through 8 for each remaining file serving node in the cluster.
NOTE: To determine which node is hosting the active management console, run the following command: /bin/ibrix_fm -i Preparing for the upgrade 1. On the active management console node, disable automated failover on all file serving nodes: /bin/ibrix_server -m -U 2. Verify that automated failover is off. In the output, the HA column should display off. /bin/ibrix_server -l 3.
/etc/init.d/ibrix_fusionmanager status The status command confirms whether the correct services are running. Output will be similar to the following: Fusion Manager Daemon (pid 18748) running... 7. Check /usr/local/ibrix/log/fusionserver.log for errors. 8. Upgrade the remaining management console node. Move the ibrix directory used in the previous release to ibrix.old. Then expand the distribution tarball or mount the distribution DVD in a directory of your choice.
6. From the active management console node, verify that the new version of X9000 Software FS/IAS is installed on the file serving nodes: /bin/ibrix_version -l –S Completing the upgrade 1. Remount the X9000 Software file systems: /bin/ibrix_mount -f -m 2. From the node hosting the active management console, turn automated failover back on: /bin/ibrix_server -m 3.
The IAD service should be running, as shown in the sample output above. If it is not, contact HP Support. Upgrading Windows X9000 clients Complete the following steps on each client: 1. 2. 3. 4. 5. Remove the old Windows X9000 client software using the Add or Remove Programs utility in the Control Panel. Copy the Windows X9000 client MSI file for the upgrade to the machine. Launch the Windows Installer and follow the instructions to complete the upgrade.
To do this, launch the X9000 Software Client User Interface on the Client. Go to the Registration Tab, enter the Management Console name, select Recover Registration, and then click Register. If you are prompted to overwrite the existing registration, select yes to complete the operation.
13 Licensing This chapter describes how to view your current license terms and how to obtain and install new X9000 Software product license keys. NOTE: For MSA2000 G2 licensing (for example, snapshots), see the MSA2000 G2 documentation. Viewing license terms The X9000 Software license file is stored in the installation directory on the management console. To view the license from the management console GUI, select Cluster Configuration in the Navigator and then select License.
3. Launch the AutoPass GUI: /usr/local/ibrix/bin/fusion-license-manager 4. In the AutoPass GUI, go to Tools, select Configure Proxy, and configure your proxy settings. 5. Click Retrieve/Install License > Key and then retrieve and install your license key. If the management console does not have an Internet connection, retrieve the license from a machine that does have a connection, deliver the file with the license to the management console machine, and then use the AutoPass GUI to import the license.
14 Upgrading the X9720 Network Storage System hardware WARNING! Before performing any of the procedures in this chapter, read the important warnings, precautions, and safety information in Appendix C and Appendix E. Adding new server blades NOTE: This requires the use of the Quick Restore DVD. See Chapter 18 for more information. 1. On the front of the blade chassis, in the next available server blade bay, remove the blank. 2. Prepare the server blade for installation.
3. Install the server blade. 4. Install the software on the server blade. The Quick Restore DVD is used for this purpose. See Chapter 18 for more information. 5. Set up failover. For more information, see the HP StorageWorks X9000 File Serving Software User Guide. 6. Enable high availability (automated failover) by running the following command on server 1: # ibrix_server -m 7. Discover storage on the server blade: ibrix_pv -a 8.
1. Power on the capacity block by first powering on the X9700cx enclosure followed by the X9700c enclosure. 2. Run the exds_stdiag command on every server to validate that the new capacity block is visible and that the correct firmware is installed. If the capacity block is not seen after running exds_stdiag, reboot the server(s). See "The exds_stdiag utility" for more information on interpreting the output from exds_stdiag. 3. If necessary, update the firmware of the new capacity block.
Upgrading the X9720 Network Storage System hardware
15 Upgrading firmware IMPORTANT: The X9720 system is shipped with the correct firmware and drivers. Do not upgrade firmware or drivers unless the upgrade is recommended by HP Support or is part of an X9720 patch provided on the HP web site. Firmware update summary When the X9720 Network Storage System software is first loaded, it automatically updates the firmware for some components. The following table describes the firmware actions and status for each component.
Locating firmware Obtain the firmware by one of the following methods: • HP technical support might send you an updated mxso-firmware RPM. This installs firmware in /opt/hp/mxso/firmware. This RPM also updates the revision information used by the exds_stdiag commands. The README.txt file in the directory tells you which file belongs to which firmware. The files listed in the README.txt file are symlinks to the actual firmware file. See the following table for a list of the symlinks.
The command automatically updates both Onboard Administrators, resetting each in turn as appropriate. Upgrading all Virtual Connect modules The Virtual Connect firmware upgrade process updates all Virtual Connect modules at once. During the update, any single NIC (non-bonded) interfaces lose network connectivity. NOTE: This procedure assumes that the management network is using a bonded configuration (that is, bond0 exists). If the system was originally installed with V1.
10. Stop the FTP service: # service vsftpd stop Upgrading X9700c controller firmware This firmware is only delivered in the mxso-firmware RPM. IMPORTANT: Before performing this procedure, ensure that the X9700c controllers are running normally. Use the exds_stdiag command to verify that the "Path from" field from all running servers is "online" for both X9700c controllers. To upgrade X9700c controller firmware: 1. Download the RPM. 2. Install on all servers. 3.
To upgrade X9700cx I/O module and disk drive firmware: 1. 2. 3. 4. 5. Download the RPM. Install on all servers. Use the exds_stdiag command to verify that all storage units are online. In particular, make sure both controllers in every X9700c chassis are online. If the path to any controller is "none," the controller might not be updated. Shut down all servers except for the first server. Shut down the first server to single user mode.
4. Copy the firmware file to the /var/ftp/pub directory. For example: # cp /opt/hp/mxso/firmware/S-2_3_2_13.img /var/ftp/pub 5. ssh to the OA using the exds user: # ssh 172.16.1.1 -l exds 6. Connect to the switch module in the bay being upgraded (bay 3 or 4): x123s-ExDS-OA1> connect interconnect 7. Log in with the same credentials as the OA. 8. Flash the new firmware. For example: => sw local flash file=ftp://172.16.3.1/pub/S-2_3_2_13.img 9.
16 Troubleshooting Managing support tickets A support ticket includes system and X9000 software information useful for analyzing performance issues and node terminations. A support ticket is created automatically if a file serving node terminates unexpectedly. You can also create a ticket manually if your cluster experiences issues that need to be investigated by HP Support.
To view a support ticket on the GUI, select Support Tickets from the Navigator. On the CLI, use the following command to view all support tickets: /bin/ibrix_supportticket -l To view details for a specific support ticket, use the following command: /bin/ibrix_supportticket -v -n When you no longer need a support ticket, you can delete it. From the GUI, select Support Ticket from the Navigator.
NOTE: During the X9000 Software installation, the names of crash dumps in the /var/crash directory change to include _PROCESSED. For example, 2010-03-08-10:09 changes to 2010-03-08-10:09_PROCESSED. NOTE: Be sure to monitor the /var/crash directory and remove any unneeded processed crash dumps. Configuring shared ssh keys To configure one-way shared ssh keys on the cluster, complete the following steps: 1. On the management console, run the following commands as root: # mkdir -p $HOME/.
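The command listing above is truncated. A minimal sketch of a typical one-way shared key setup follows; the key type, filenames, and node names are assumptions and are not taken from this guide:
# mkdir -p $HOME/.ssh
# chmod 700 $HOME/.ssh
# ssh-keygen -t rsa -N "" -f $HOME/.ssh/id_rsa
# ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@node1
# ssh-copy-id -i $HOME/.ssh/id_rsa.pub root@node2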
When the escalate tool finishes, it generates a report and stores it in a file such as /exds_glory1_escalate.tgz.gz. Copy this file to another system and send it to HP Services. Useful utilities and processes Accessing the Onboard Administrator (OA) through the network The OA has a CLI that can be accessed using ssh. The address of the OA is automatically placed in /etc/hosts. The name is -mp.
Accessing the Onboard Administrator (OA) via service port Each OA has a service port (this is the right-most Ethernet port on the OA). This allows you to use a laptop to access the OA command line interface. See HP BladeSystem c7000 Enclosure Setup and Installation Guide for instructions on how to connect a laptop to the service port. Using hpacucli – Array Configuration Utility (ACU) The hpacucli command is a command line interface to the X9700c controllers.
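For example, a commonly used hpacucli invocation lists all controllers with their arrays and logical drives (this is a generic hpacucli command rather than one taken from this guide):
# hpacucli ctrl all show config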
[root@kudos1 ~]# exds_stdiag
ExDS storage diagnostic rev 7336
Storage visible to kudos1 Wed 14 Oct 2009 14:15:33 +0000
node 7930RFCC BL460c.G6 fw I24.20090620 cpus 2 arch Intel
  hba 5001438004DEF5D0 P410i in 7930RFCC fw 2.00 boxes 1 disks 2 luns 1 batteries 0/cache -
  hba PAPWV0F9SXA00S P700m in 7930RFCC fw 5.74 boxes 0 disks 0 luns 0 batteries 0/cache
switch HP.3G.SAS.BL.SWH in 4A fw 2.72
switch HP.3G.SAS.BL.SWH in 3A fw 2.72
switch HP.3G.SAS.BL.SWH in 4B fw 2.72
switch HP.3G.SAS.BL.SWH in 3B fw 2.
• Reports missing, failed, or degraded site uplinks
• Reports missing or failed NICs in server blades
Sample output
exds_netperf
The exds_netperf tool measures network performance. The tool measures performance between a client system and the X9720 Network Storage System. Run this test when the system is first installed. Where networks are working correctly, the performance results should match the expected link rate of the network; that is, for a 1-GbE link, expect about 90 MB/s.
• On the client host, run exds_netperf in serial mode against each X9720 Network Storage System server in turn. For example, if there are two servers whose eth2 addresses are 16.123.123.1 and 16.123.123.2, use the following command: # exds_netperf --serial --server "16.123.123.1 16.123.123.2" • On a client host, run exds_netperf in parallel mode, as shown in the following example.
they fail. Failed components will be reported in the output of ibrix_vs -i, and failed storage components will be reported in the output of ibrix_health -V -i. Identifying failed I/O modules on an X9700cx chassis When an X9700cx I/O module (or the SAS cable connected to it) fails, the X9700c controller attached to the I/O module reboots and if the I/O module does not immediately recover, the X9700c controller stays halted.
1. Verify that SAS cables are connected to the correct controller and I/O module. The following diagram shows the correct wiring of the SAS cables. 1. X9700c 2. X9700cx primary I/O module (drawer 2) 3. X9700cx secondary I/O module (drawer 2) 4. X9700cx primary I/O module (drawer 1) 5.
2. Check the seven-segment display and note the following as it applies to your situation: • If the seven-segment display shows "on," then both X9700c controllers are operational. • If the seven-segment display shows "on" but there are path errors as described earlier in this document, then the problem could be with the SAS cables connecting the X9700c controller to the SAS Switch in the blade chassis. Replace the SAS cable and run the exds_stdiag command, which should report two controllers.
7. Examine the I/O module LEDs.
8. If an I/O module has an amber LED:
a. Detach the SAS cable connecting the I/O module to the X9700c controller.
b. Ensure that the disk drawer is fully pushed in and locked.
c. Remove the I/O module.
d. Replace with a new I/O module (it will not engage with the disk drawer unless the drawer is fully pushed in).
e. Re-attach the SAS cable. Ensure it is attached to the "IN" port (the bottom port).
9. If the seven-segment display now shows “on,” run the exds_stdiag command and validate that both controllers are seen by exds_stdiag. 10. If the fault has not cleared at this stage, there could be a double fault (that is, failure of two I/O modules). Alternatively, one of the SAS cables could be faulty. Contact HP Support to help identify the fault or faults. Run the exds_escalate command to generate an escalate report for use by HP Support as follows: # exds_escalate 11.
Re-seating an X9700c controller Make sure you are re-seating the correct controller. You should observe both a flashing amber LED and the seven-segment display. An H1 or C1 code indicates controller 1 (left) is halted; an H2 or C2 code indicates that controller 2 (right) should be re-seated. NOTE: There is no need to disconnect the SAS cables during this procedure. To re-seat the controller: 1. Squeeze the controller thumb latch and rotate the latch handle down. 2.
Troubleshooting specific issues Software services Cannot start services on the management console, a file serving node, or a Linux X9000 client SELinux might be enabled. To determine the current state of SELinux, use the getenforce command. If it returns enforcing, disable SELinux using either of these commands:
setenforce Permissive
setenforce 0
To permanently disable SELinux, edit its configuration file (/etc/selinux/config) and set the SELINUX= parameter to either permissive or disabled.
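For example, the relevant line in /etc/selinux/config would then read as follows (choose permissive or disabled as appropriate):
SELINUX=disabled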
operations. If, however, the network connection between a client and the management console is not active, the client cannot receive the updated map, resulting in client I/O errors. To fix the problem, restore the network connection between the clients and the management console. Windows X9000 clients Logged in but getting a “Permission Denied” message The X9000 client cannot access the Active Directory server because the domain name was not specified.
1. Log on to the server. 2. Start hp-ilo: # service hp-ilo start 3. Flash the power PIC: # /opt/hp/mxso/firmware/power_pic_scexe 4. Reboot the server. ibrix_fs -c failed with "Bad magic number in super-block" If a file system creation command fails with an error like the following, it could be because the segment creation command failed to preformat the LUN.
2. Immediately run the following command: # exds_escalate This gathers log information that is useful in diagnosing whether the data can be recovered. Generally, if the failure is due to real disk failures, the data cannot be recovered. However, if the failure is due to an inadvertent removal of a working disk drive, it may be possible to restore the LUN to operation. 3. Contact HP Support as soon as possible.
11. Perform the following steps for each X9700c controller in turn: a. Slide out the controller until the LEDs extinguish. b. Reinsert the controller. c. Wait for the seven-segment display to show "on". d. Run the exds_stdiag command on the affected server. e. If the result is OK, the procedure is complete; otherwise, repeat steps a through d on the next controller. 12. If the above steps do not produce results, replace the HP P700m. 13. Boot the server and run exds_stdiag. 14.
to flicker green. Note that a disk drive could be in use even if the online/activity LED is not illuminated green. IMPORTANT: Do not remove a disk drive unless the fault/UID LED is amber. See the HP StorageWorks X9720 Network Storage System Controller User Guide for more information about the LED descriptions.
Port;Name;Status;Type;Speed
1;enc0:1:X1;Linked (Active) (10);CX4;Auto
2;enc0:1:X2;Not Linked;absent;Auto
3;enc0:1:X3;Not Linked;absent;Auto
4;enc0:1:X4;Not Linked;absent;Auto
5;enc0:1:X5;Not Linked;absent;Auto
6;enc0:1:X6;Not Linked;absent;Auto
->
There are 16 identical profiles assigned to servers.
To repair information on all file serving nodes, omit the -h HOSTLIST argument.
17 Replacing components in the X9720 Network Storage System Customer replaceable components WARNING! Before performing any of the procedures in this chapter, read the important warnings, precautions, and safety information in Appendix C and Appendix E. IMPORTANT: To avoid unintended consequences, HP recommends that you perform the procedures in this chapter during scheduled maintenance times.
Based on availability and where geography permits, CSR parts will be shipped for next business day delivery. Same-day or four-hour delivery might be offered at an additional charge, where geography permits. If assistance is required, you can call the HP Technical Support Center, and a technician will help you over the telephone. The materials shipped with a replacement CSR part specify whether a defective part must be returned to HP.
Required tools The following tools might be necessary for some procedures: • • • • T-10 Torx screwdriver T-15 Torx screwdriver 4-mm flat-blade screwdriver Phillips screwdriver Additional documentation The information in this section pertains to the specifics of the X9720 Network Storage System. For detailed component replacement instructions, see the following documentation at http://www.hp.
2. Transfer all of the components from the original blade system (particularly the OA modules, Virtual Connect modules, SAS switches, and blades) to the new blade enclosure. IMPORTANT: Make sure that you keep the same server blade bay numbers when moving blades from the old chassis to the new chassis. 3. Power up the blade enclosure; the blades should boot correctly.
You do not need to shut down the server blade; disk drives can be hot swapped. However, you must replace the removed drive with a drive of the same size. To replace a disk drive in the server blade: 1. Check the state of the internal logical disk drive. For more information, see the HP ProLiant BL460c Server Blade User Guide. 2. If the state is failed, use the procedure in Replacing both disk drives. 3. Remove one drive. Make sure you remove the failed drive.
1. Disconnect the network connections into the Ethernet Virtual Connect (the module in bay 1 or 2). 2. Remove the VC module. 3. Replace the VC module. 4. Reconnect the cable that was disconnected in step 1. 5. Remove and then reconnect the uplink to the customer network for bay 2. NOTE: Clients lose connectivity during this procedure unless you are using a bonded network.
6. Connect to the switch module that was not replaced: x123s-ExDS-OA1> connect interconnect <3 or 4> 7. Log in with the same credentials as the OA. 8. Run the following command to prepare for a zonegroup update: => sw local forceactive 9.
12. Disconnect from the interconnect by pressing Ctrl-Shift-_ (the Control, Shift, and underscore keys) and then press the D key for "D)isconnect". 13. Exit from the OA using "exit". NOTE: Wait at least 60 seconds after seating the SAS switch before removing another SAS switch. The SAS switch now provides a redundant access path to storage. Storage units will balance I/O between both switches on their subsequent restarts. Replacing the P700m mezzanine card 1. Shut down the server blade. 2.
It is always safe to remove a drive in the failed state. However, if the drive is in the ok or predict-fail state, do not remove it if any LUN on that array is not in the ok state (that is, if LUNs are degraded or rebuilding). This is because the RAID logic might need the data on that drive to reconstruct data for another drive that has failed. Removing a drive in this situation could break the LUN and set it to the failed state.
1. Remove the SAS cable in port 1 that connects the X9700c to the SAS switch in the c-Class blade enclosure. Do not remove the two SAS expansion cables that connect the X9700c controller to the I/O controllers on the X9700cx enclosure. 2. Slide the X9700c controller partially out of the chassis: a. Squeeze the controller thumb latch and rotate the latch handle down. b. Pull the controller straight out of the chassis until it has clearly disengaged. 3.
1. Check the expiration date of the replacement battery spare part kit. If the battery has expired, do not use it; get another replacement battery. 2. Determine if a controller is booting by observing the seven-segment display. If a controller is booting, the display will not read “on.” IMPORTANT: Do not replace the battery while a controller is booting. 3. Remove the old battery. 4. Insert new battery. 5. Use the exds_stdiag command to verify that the battery is charging or is working properly.
Replacing the X9700c chassis You cannot replace the X9700c chassis while the system is in operation. HP recommends that you perform this operation only during a scheduled maintenance window. 1. Shut down all servers. 2. Remove power connectors to the X9700c and associated X9700cx chassis. 3. Remove all SAS cables. 4. Remove disk drives, making sure to note which drive was in which bay. 5. Remove the X9700c chassis from the rack. 6.
4. It is normal for the X9700c controller connected to the I/O module to reset itself. The seven-segment display of the X9700c enclosure will no longer show “on,” and the X9700c controller will have an amber warning light. If the I/O module had previously failed, the X9700c controller will already be in this state. 5. Remove the I/O module. 6. Replace the I/O module (it will not engage with the disk drawer unless the drawer is fully pushed in). 7. Re-attach the SAS cable.
10. Mount the file systems that were unmounted in step 1. For more information, see the HP StorageWorks X9000 File Serving Software User Guide. 11. If the system is not operating normally, repeat the entire procedure until the system is operational. See HP StorageWorks 600 Modular Disk System Maintenance and Service Guide for more information. Replacing the X9700cx power supply There are four power supplies in each X9700cx chassis—two on the left and two on the right.
18 Recovering the X9720 Network Storage System
The instructions in this section are necessary in the following situations:
• The X9720 fails and must be recovered.
• A server blade is added or replaced.
• The dedicated X9000 management console blade fails and must be replaced.
• A file serving node fails and must be replaced.
You will need to create a QuickRestore DVD, as described later, and then install it on the affected blade.
3. Burn the ISO image to a DVD. 4. Insert the QuickRestore DVD into a USB DVD drive cabled to the Onboard Administrator, or use the dongle to connect the drive to the front of the blade. IMPORTANT: Use an external USB drive that has external power; do not rely on the USB bus for power to drive the device. 5. Restart the server to boot from the DVD-ROM. 6. When the following screen appears, enter qr to start the recovery. The server reboots automatically after the installation is complete.
7. The Configuration Wizard starts automatically. Use the appropriate configuration procedure:
• To configure a file serving node, select one of the following:
  • When your cluster was configured initially, the installer may have created a template for configuring file serving nodes. To use this template to configure the file serving node undergoing recovery, go to “Configuring a file serving node using the original template” on page 147.
NOTE: If the list does not include the appropriate management console, or you want to customize the cluster configuration for the file serving node, select Cancel. Go to Configuring a file serving node manually, page 151 for information about completing the configuration. 4. On the Verify Hostname dialog box, enter a hostname for the node, or accept the hostname generated by the hostname template. 5. The Verify Configuration window shows the configuration received from the management console.
NOTE: If you select Reject, the wizard will exit and the shell prompt will be displayed. You can restart the Wizard by entering the command /usr/local/ibrix/autocfg/bin/menu_ss_wizard or by logging in to the server again. 6. If the specified hostname already exists in the cluster (the name was used by the node you are replacing), the Replace Existing Server window asks whether you want to replace the existing server with the node you are configuring.
If you have configured a user network, enter a VIF IP address and netmask. If you configured a passive management console, enter the following command to verify the status of the console: ibrix_fm -i Next, complete the restore on the file serving node. Completing the restore on a file serving node 1. Ensure that you have root access to the node. The restore process sets the root password to hpinvent, the factory default. 2. Run the exds_stdiag command on the node.
4. The QuickRestore DVD enables the iptables firewall. Either make the firewall configuration match that of your other server blades to allow traffic on the appropriate ports, or disable the service entirely by running the chkconfig iptables off and service iptables stop commands. To allow traffic on the appropriate ports, open ports 80, 443, 1234, 9000, 9005, 9008, and 9009 (see the example following this procedure).
5. Use the AutoPass GUI to reinstall your license.
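If you prefer to keep iptables enabled, the ports listed in step 4 can be opened with standard iptables rules instead of disabling the service. This is a minimal sketch that assumes the stock Red Hat iptables service and TCP traffic on those ports.

# Open the ports used by the X9000 management and file serving services
for port in 80 443 1234 9000 9005 9008 9009; do
    iptables -I INPUT -p tcp --dport $port -j ACCEPT
done
service iptables save    # persist the rules across reboots

# Or disable the firewall entirely, as described in step 4:
chkconfig iptables off
service iptables stop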
3. The Configuration Wizard attempts to discover management consoles on the network and then displays the results. Select Cancel to configure the node manually. (If the wizard cannot locate a management console, the screen shown in step 4 will appear.) 4. The file serving node Configuration Menu appears.
5. The Cluster Configuration Menu lists the configuration parameters that you will need to set. Use the Up and Down arrow keys to select an item in the list. When you have made your selection, press Tab to move to the buttons at the bottom of the dialog box, and press Space to go to the next dialog box. 6. Select Management Console from the menu, and enter the IP address of the management console. This is typically the address of the management console on the cluster network.
7. Select Hostname from the menu, and enter the hostname of this server. 8. Select Time Zone from the menu, and then use Up or Down to select your time zone.
9. Select Default Gateway from the menu, and enter the IP Address of the host that will be used as the default gateway. 10. Select DNS Settings from the menu, and enter the IP addresses for the primary and secondary DNS servers that will be used to resolve domain names. Also enter the DNS domain name.
11. Select NTP Servers from the menu, and enter the IP addresses or hostnames for the primary and secondary NTP servers. 12. Select Networks from the menu, and then select the option to create a bond for the cluster network.
You are creating a bonded interface for the cluster network; select Ok on the Select Interface Type dialog box. Enter a name for the interface (bond0 for the cluster interface) and specify the appropriate options and slave devices. The factory defaults for the slave devices are eth0 and eth3.
13. When the Configure Network dialog box reappears, select bond0.
14. To complete the bond0 configuration, enter a space to select the Cluster Network role. Then enter the IP address and netmask information that the network will use. Repeat this procedure to create a bonded user network (typically bond1 with eth1 and eth2) and any custom networks as required.
15. When you have completed your entries on the File Serving Node Configuration Menu, select Continue. 16. Verify your entries on the confirmation screen, and select Commit to apply the values to the file serving node and register it with the management console. 17. If the hostname specified for the node already exists in the cluster (the name was used by the node you are replacing), the Replace Existing Server window asks whether you want to replace the existing server with the node you are configuring.
If you configured a passive management console, enter the following command to verify the status of the console: ibrix_fm -i
IMPORTANT: Next, go to “Completing the restore on a file serving node.”
Configuring the management console on the dedicated (non-agile) Management Server blade
This procedure is intended for systems that are not using the agile management console configuration and instead have the standard management console configured on the dedicated Management Server blade.
3. You can now configure the management console. The Introduction screen describes the configuration process. 4. The Management Console Configuration Menu lists the configuration parameters that you will need to set. Do not enable the Agile Management feature.
5. Select Hostname from the menu, and enter the hostname of this server. 6. Select Time Zone from the menu, and then use Up or Down to select your time zone.
7. Select Default Gateway from the menu, and enter the IP Address of the host that will be used as the default gateway. 8. Select DNS Settings from the menu, and enter the IP addresses for your DNS servers. Also enter the DNS domain name.
9. Select NTP Servers from the menu, and enter the IP addresses or hostnames for the primary and secondary NTP servers. 10. Select Networks from the menu. You will need to create one cluster network interface, which will be used for intracluster communication. Typically this interface is configured as bond0. You may also need to create a user network, which is used for server-to-client communication. The user network is typically bond1. To create a bond, select the option to create a bond.
You are creating a bonded interface for the cluster network; select Ok on the Select Interface Type dialog box. Enter a name for the interface (bond0 for the cluster interface) and specify the appropriate options and slave devices.
11. When the Configure Network dialog box reappears, select bond0.
12. To complete the bond0 configuration, enter a space to select the Cluster Network role. Then enter the IP address and netmask information that the network will use. Repeat this procedure to create a bonded user network (typically bond1) and any custom networks as required.
13. The Confirm Management Console Configuration screen lists the values you have entered for the management console. You can change the values if needed. When you select Commit, the values will be applied. Networking will be set up, and the management console software will start. After you confirm the management console configuration, a Cluster Configuration menu is displayed. This menu is used to create a template for configuring file serving nodes. You can exit this menu.
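After you commit the configuration and networking is set up, the bonded interfaces can be checked from the shell. This is a quick verification sketch using standard Linux tools, not an X9000-specific command.

# Confirm that the cluster bond is up and lists the expected slave interfaces
cat /proc/net/bonding/bond0

# Check the IP address and netmask assigned to the cluster interface
ip addr show bond0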
4. The QuickRestore DVD enables the iptables firewall. Either make the firewall configuration match that of your other server blades to allow traffic on appropriate ports, or disable the service entirely by running the chkconfig iptables off and service iptables stop commands. To allow traffic on appropriate ports, open the following ports:
• 80
• 443
• 1234
• 9000
• 9005
• 9008
• 9009
5. Use the AutoPass GUI to reinstall your license.
After a restore, the NIC device numbering might not match the numbering on the previously installed systems. For example, a NIC that was eth3 on a previously installed system could now be eth5. If BondsDegraded messages are reported, take the following steps:
1. From a working node in the cluster, log in to the Onboard Administrator (OA). SSH or telnet into the OA at 172.16.1.1. The default login from the factory is exds and the password is hpinvent.
2. Run the show server command, and note the MAC addresses reported for LOM 1-a, LOM 1-b, and LOM 2-a.
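One common way to apply the MAC addresses recorded from the OA is to pin each interface name in its ifcfg file on the node. This is a minimal sketch that assumes a Red Hat-style network-scripts layout; it is not necessarily the exact procedure this guide prescribes, and the MAC address shown is a placeholder.

# /etc/sysconfig/network-scripts/ifcfg-eth0  (example; repeat for each interface)
DEVICE=eth0
HWADDR=xx:xx:xx:xx:xx:xx     # MAC address reported by the OA for LOM 1-a
ONBOOT=yes
MASTER=bond0                 # slave of the cluster bond, per the factory defaults
SLAVE=yes

# After updating the files, restart networking so the names bind to the correct ports:
service network restart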
19 Support and other resources Contacting HP For worldwide technical support information, see the HP support website: http://www.hp.
Rack stability Rack stability protects personnel and equipment. WARNING! To reduce the risk of personal injury or damage to equipment: • Extend leveling jacks to the floor. • Ensure that the full weight of the rack rests on the leveling jacks. • Install stabilizing feet on the rack. • In multiple-rack installations, fasten racks together securely. • Extend only one rack component at a time. Racks can become unstable if more than one component is extended.
A Component and cabling diagrams
Base and expansion cabinets
An X9720 Network Storage System base cabinet has from 3 to 16 performance blocks (that is, server blades) and from 1 to 4 capacity blocks. An expansion cabinet can support up to four more capacity blocks, bringing the system to eight capacity blocks. The first server blade is configured as the management console. The other servers are configured as file serving nodes.
Back view of a base cabinet with one capacity block
1. Management switch 2
2. Management switch 1
3. X9700c 1
4. TFT monitor and keyboard
5. c-Class Blade enclosure
Front view of a full base cabinet
1 X9700c 4
2 X9700c 3
3 X9700c 2
4 X9700c 1
5 X9700cx 4
6 X9700cx 3
7 TFT monitor and keyboard
8 c-Class Blade Enclosure
9 X9700cx 2
10 X9700cx 1
Back view of a full base cabinet
1 Management switch 2
2 Management switch 1
3 X9700c 4
4 X9700c 3
5 X9700c 2
6 X9700c 1
7 X9700cx 4
8 X9700cx 3
9 TFT monitor and keyboard
10 c-Class Blade Enclosure
11 X9700cx 2
12 X9700cx 1
Front view of an expansion cabinet
The optional X9700 expansion cabinet can contain from one to four capacity blocks. The following diagram shows a front view of an expansion cabinet with four capacity blocks.
1. X9700c 8
2. X9700c 7
3. X9700c 6
4. X9700c 5
5. X9700cx 8
6. X9700cx 7
7. X9700cx 6
8. X9700cx 5
Back view of an expansion cabinet with four capacity blocks
1. X9700c 8
2. X9700c 7
3. X9700c 6
4. X9700c 5
5. X9700cx 8
6. X9700cx 7
7. X9700cx 6
8. X9700cx 5
Performance blocks (c-Class Blade enclosure)
A performance block is a special server blade for the X9720. Server blades are numbered according to their bay number in the blade enclosure. Server 1 is in bay 1 in the blade enclosure, and so on. Server blades must be contiguous; empty blade bays are not allowed between server blades.
The following diagram shows a front view of a performance block (c-Class Blade enclosure) with half-height device bays numbered 1 through 16.
Front view of a c-Class Blade enclosure
Rear view of a c-Class Blade enclosure
1. Interconnect bay 1 (Virtual Connect Flex-10 10Gb Ethernet Module)
6. Interconnect bay 6 (reserved for future use)
Flex-10 networks The server blades in the X9720 Network Storage System have two built-in Flex-10 10Gb NICs. The Flex-10 technology comprises the Flex-10 NICs and the Flex-10 Virtual Connect modules in interconnect bays 1 and 2 of the performance chassis. Each Flex-10 NIC is configured to represent four physical NIC devices, also called FlexNICs, with a total bandwidth of 10Gb/s.
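On a running server blade, the FlexNICs appear to the operating system as ordinary Ethernet devices. The following is a minimal sketch, using standard Linux tools, for confirming their presence and the bandwidth allocated to one of them; the device name is an example.

# List the Ethernet devices presented by the two Flex-10 NICs
ip -o link show

# Report the speed currently allocated to one FlexNIC
ethtool eth0 | grep -i speed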
1. Box 1—X9700c
2. Box 2—X9700cx, left drawer (as viewed from the front)
3. Box 3—X9700cx, right drawer (as viewed from the front)
An array normally has two controllers. Each controller has a battery-backed cache. Each controller has its own firmware. Normally all servers should have two redundant paths to all arrays.
X9700c (array controller with 12 disk drives)
Front view of an X9700c
1. Bay 1
2. Bay 2
3. Bay 3
4. Bay 4
5. Power LED
6. System fault LED
7. UID LED
2. Battery 2
3. SAS expander port 1
4. UID
5. Power LED
6. System fault LED
7. On/Off power button
8. Power supply 2
10. X9700c controller 2
11. SAS expander port 2
12. SAS port 1
13. X9700c controller 1
14. Fan 1
15. Power supply 1
X9700cx (dense JBOD with 70 disk drives)
NOTE: This component is also known as the HP StorageWorks 600 Modular Disk System. For an explanation of the LEDs and buttons on this component, see the HP StorageWorks 600 Modular Disk System User Guide at http://www.hp.
Rear view of an X9700cx
1. Power supply
2. Primary I/O module drawer 2
3. Primary I/O module drawer 1
4. Out SAS port
5. In SAS port
6. Secondary I/O module drawer 1
7. Secondary I/O module drawer 2
8. Fan
Cabling diagrams
Capacity block cabling—Base and expansion cabinets
A capacity block is comprised of the X9700c and X9700cx.
CAUTION: Correct cabling of the capacity block is critical for proper X9720 Network Storage System operation.
3 X9700cx secondary I/O module (drawer 2)
4 X9700cx primary I/O module (drawer 1)
5 X9700cx secondary I/O module (drawer 1)
Virtual Connect Flex-10 Ethernet module cabling—Base cabinet
(Diagram labels: Site network, Onboard Administrator, Available uplink port)
1. Management switch 2
2. Management switch 1
3. Bay 1 (Virtual Connect Flex-10 10Gb Ethernet Module for connection to site network)
7. Bay 5 (reserved for future use)
8. Bay 6 (reserved for future use)
9. Bay 7 (reserved for optional components)
SAS switch cabling—Base cabinet
NOTE: Callouts 1 through 3 indicate additional X9700c components.
1 X9700c 4
2 X9700c 3
3 X9700c 2
4 X9700c 1
5 SAS switch ports 1 through 4 (in interconnect bay 3 of the c-Class Blade Enclosure). Ports 2 through 4 are reserved for additional capacity blocks.
6 SAS switch ports 5 through 8 (in interconnect bay 3 of the c-Class Blade Enclosure). Reserved for expansion cabinet use.
SAS switch cabling—Expansion cabinet
NOTE: Callouts 1 through 3 indicate additional X9700c components.
1 X9700c 8
2 X9700c 7
3 X9700c 6
5 SAS switch ports 1 through 4 (in interconnect bay 3 of the c-Class Blade Enclosure). Used by base cabinet.
6 SAS switch ports 5 through 8 (in interconnect bay 3 of the c-Class Blade Enclosure).
7 SAS switch ports 1 through 4 (in interconnect bay 4 of the c-Class Blade Enclosure).
B Spare parts list Replacing components in the HP ExDS9100 Storage System explained how to replace some of the X9720 Network Storage System components. The following tables list spare parts (both customer replaceable and non customer replaceable) for the X9720 Network Storage System components. Spare parts are categorized as follows: • Mandatory. Parts for which customer self repair is mandatory. If you ask HP to replace these parts, you will be charged for the travel and labor costs of this service.
Description  Spare part number  Customer self repair
SPS-SPS-STICK,ATTACH'D CBL,C13 0-1FT  419595-001  Mandatory
SPS-RACK,BUS BAR & Wire Tray  457015-001  Optional
AW552A—X9700 Expansion Rack
Description  Spare part number  Customer self repair
SPS-RACK,BUS BAR & WIRE TRAY  457015-001  Optional
SPS-STABLIZER,600MM,10GK2  385973-001  Mandatory
SPS-PANEL,SIDE,10642,10KG2  385971-001  Mandatory
SPS-STICK,4X FIXED,C-13,OFFSET,WW  483915-001  Optional
SPS-BRACKETS,PDU  252641-001  Optional
SPS-SPS-STICK
Description  Spare part number  Customer self repair
SPS-SFP,1,VC,RJ-45  453578-001  Optional
AW550A—X9700 Blade Server
Description  Spare part number  Customer self repair
SPS-CAGE, HDD, W/BEZEL  531228-001  Mandatory
SPS-PLASTICS/HARDWARE, MISC  531223-001  Mandatory
SPS-BACKPLANE, HDD, SAS  531225-001  Mandatory
SPS-PROC,NEHALEM EP 2.
Description  Spare part number  Customer self repair
SPS-BD,CONTROLLER,9100C  489833-001  Optional
SPS-BD,7-SEGMENT,DISPLAY  399057-001  Optional
SPS-BD,DIMM,DDR2,MOD,512MB  398645-001  Mandatory
SPS-HDD, B/P, W/CABLES & DRAWER ASSY  455976-001  No
SPS-BD,LED PANEL,W/CABLE  455979-001  Optional
SPS-FAN, SYSTEM  413996-001  Mandatory
SPS-PWR SUPPLY  441830-001  Mandatory
SPS-POWER BLOCK,W/POWER B/P BDS  455974-001  Optional
SPS-BD, 2 PORT, W/1.
C Warnings and precautions Electrostatic discharge information See Electrostatic discharge. Grounding methods There are several methods for grounding. Use one or more of the following methods when handling or installing electrostatic sensitive parts: • Use a wrist strap connected by a ground cord to a grounded workstation or computer chassis. Wrist straps are flexible straps with a minimum of 1 megohm ±10 percent resistance in the ground cords.
WARNING! Any RJ-45 receptacle marked with these symbols indicates a network interface connection. To reduce the risk of electrical shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle. WARNING! Any surface or area of the equipment marked with these symbols indicates the presence of a hot surface or hot component. Contact with this surface could result in injury.
Rack warnings and precautions
Ensure that precautions have been taken to provide for rack stability and safety. It is important to follow these precautions to protect both personnel and property. Follow all cautions and warnings included in the installation instructions. WARNING! To reduce the risk of personal injury or damage to the equipment: • Observe local occupational safety requirements and guidelines for heavy equipment handling.
Device warnings and precautions WARNING! To reduce the risk of electric shock or damage to the equipment: • Allow the product to cool before removing covers and touching internal components. • Do not disable the power cord grounding plug. The grounding plug is an important safety feature. • Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times. • Disconnect power from the device by unplugging the power cord from either the electrical outlet or the device.
CAUTION: To properly ventilate the system, you must provide at least 7.6 centimeters (3.0 inches) of clearance at the front and back of the device. CAUTION: When replacing hot-pluggable components in an operational X9720 Network Storage System, allow approximately 30 seconds between removing the failed component and installing the replacement. This time is needed to ensure that configuration data about the removed component is cleared from the system registry.
D Regulatory compliance and safety Regulatory compliance identification numbers For the purpose of regulatory compliance certifications and identification, this product has been assigned a unique regulatory model number. The regulatory model number can be found on the product nameplate label, along with all required approval markings and information. When requesting compliance information for this product, always refer to this regulatory model number.
• Reorient or relocate the receiving antenna. • Increase the separation between the equipment and receiver. • Connect the equipment into an outlet on a circuit different from that to which the receiver is connected. • Consult the dealer or an experienced radio or television technician for help. Declaration of conformity for products marked with the FCC logo, United States only This device complies with Part 15 of the FCC Rules.
WARNING! Use of controls or adjustments, or performance of procedures other than those specified herein, or in the laser product's installation guide, could result in hazardous radiation exposure. To reduce the risk of exposure to hazardous radiation: • Do not try to open the module enclosure. There are no user-serviceable components inside. • Do not operate controls, make adjustments, or perform procedures to the laser device, other than those specified herein.
BSMI notice
Japanese notice
Korean notice (A&B)
Class A equipment
Class B equipment
Safety
Battery Replacement notice
WARNING! The computer contains an internal lithium manganese dioxide, a vanadium pentoxide, or an alkaline battery pack. A risk of fire and burns exists if the battery pack is not properly handled. To reduce the risk of personal injury: • Do not attempt to recharge the battery. • Do not expose the battery to temperatures higher than 60˚C (140˚F). • Do not disassemble, crush, puncture, short external contacts, or dispose of in fire or water.
Japanese Power Cord notice
Electrostatic discharge
To prevent damage to the system, be aware of the precautions you need to follow when setting up the system or handling parts. A discharge of static electricity from a finger or other conductor could damage system boards or other static-sensitive devices. This type of damage could reduce the life expectancy of the device.
Waste Electrical and Electronic Equipment directive
Czechoslovakian notice
Danish notice
Dutch notice
English notice
Estonian notice
Finnish notice
French notice
German notice
Greek notice
Hungarian notice
Italian notice
Latvian notice
Lithuanian notice
Polish notice
Portuguese notice
Slovakian notice
Slovenian notice
Spanish notice
Swedish notice
Glossary
ACE Access control entry.
ACL Access control list.
ADS Active Directory Service.
ALB Advanced load balancing.
BMC Baseboard Management Controller.
CIFS Common Internet File System. The protocol used in Windows environments for shared folders.
CLI Command-line interface. An interface comprised of various commands which are used to control operating system responses.
CSR Customer self repair.
DAS Direct attach storage.
MTU Maximum Transmission Unit.
NAS Network attached storage.
NFS Network file system. The protocol used in most UNIX environments to share folders or mounts.
NIC Network interface card. A device that handles communication between a device and other devices on a network.
NTP Network Time Protocol. A protocol that enables the storage system’s time and date to be obtained from a network-attached server, keeping multiple hosts and storage devices synchronized.
OA HP Onboard Administrator.
WWNN World wide node name. A globally unique 64-bit identifier assigned to each Fibre Channel node process.
WWPN World wide port name. A unique 64-bit address used in a FC storage network to identify each device in a FC network.
Index A ACU using hpacucli, 113 adding capacity blocks, 100 server blades, 99 agile management console, 31 Array Configuration Utility using hpacucli, 113 AutoPass, 97 B backups file systems, 49 management console configuration, 49 NDMP applications, 49 blade enclosure replacing, 133 booting server blades, 15 booting X9720, 15 C cabling diagrams, 185 capacity block firmware, 106 overview, 182 capacity blocks adding, 100 removing, 101 replacing disk drive, 138 Class A equipment, 199 Class B equipment, 199
events, cluster add SNMPv3 users and groups, 47 configure email notification, 43 configure SNMP agent, 45 configure SNMP notification, 44 configure SNMP trapsinks, 46 define MIB views, 47 delete SNMP configuration elements, 48 enable or disable email notification, 44 list email notification settings, 44 list SNMP configuration, 48 monitor, 58 remove, 59 view, 58 exds escalate command, 111 exds_netdiag, 114 exds_netperf, 115 exds_stdiag utility, 113 F FCC logo, 200 features X9720, 11 file serving node recov
High Availability agile management console, 31 automated failover, turn on or off, 36 check configuration, 41 defined, 32 delete network interface monitors, 38 delete network interface standbys, 39 delete power sources, 35 detailed configuration report, 42 dissociate power sources, 35 fail back a node, 36 failover a node manually, 36 failover protection, 12 HBA monitoring, turn on or off, 40 identify network interface monitors, 38 identify network interface standbys, 38 identify standby-paired HBA ports, 40
Onboard Administrator accessing via serial port, 112 accessing via service port, 113 firmware update, 104 replacing, 135 P P700m mezzanine card replacing, 138 passwords, change GUI password, 21 POST error messages, 116 Q QuickRestoreDVD, 145 R rack stability warning, 174 regulatory compliance, 199 related documentation, 173 removing capacity blocks, 101 server blades, 101 replacing blade enclosure, 133 capacity blocks disk drive, 138 components, 131 OA, 135 Onboard Administrator, 135 P700m mezzanine card
user network interface add, 71 configuration rules, 74 defined, 71 identify for X9000 clients, 72 modify, 72 prefer, 73 unprefer, 73 V VC module firmware update, 105 replacing, 135 Virtual Connect domain, configure, 128 Virtual Connect module firmware update, 105 replacing, 135 W warning rack stability, 174 warnings loading rack, 195 weight, 194 Waste Electrical and Electronic Equipment directive, 205 websites customer self repair, 174 HP, 174 HP Subscriber's Choice for Business, 174 product manuals, 173