HP X9720 Network Storage System Administrator Guide Abstract This guide describes tasks related to cluster configuration and monitoring, system upgrade and recovery, hardware component replacement, and troubleshooting. It does not document X9000 file system features or standard Linux administrative tools and commands. For information about configuring and using X9000 Software file system features, see the HP X9000 File Serving Software File System User Guide.
© Copyright 2009, 2011 Hewlett-Packard Development Company, L.P. Confidential computer software. Valid license from HP required for possession, use or copying. Consistent with FAR 12.211 and 12.212, Commercial Computer Software, Computer Software Documentation, and Technical Data for Commercial Items are licensed to the U.S. Government under vendor's standard commercial license. The information contained herein is subject to change without notice.
Contents
1 Product description...................................................................................11
HP X9720 Network Storage System features...............................................................................11
System components.................................................................................................................11
HP X9000 Software features....................................................................................................
Manually failing over a file serving node..............................................................................31
Failing back a file serving node...........................................................................................32
Using network interface monitoring......................................................................................32
Setting up HBA monitoring..................................................................................................
Monitoring cluster health.........................................................................................................53
Health checks....................................................................................................................53
Health check reports..........................................................................................................53
Viewing logs..........................................................................................................
Viewing network interface information..................................................................................77
11 Migrating to an agile management console configuration.............................78
Backing up the configuration....................................................................................................78
Performing the migration..........................................................................................................
Adding/deleting commands or logs in the XML file..............................................................110
General troubleshooting steps................................................................................................110
Escalating issues...................................................................................................................110
Useful utilities and processes..................................................................................................
Replacing the SAS switch in Bay 3 or 4..............................................................................131
Replacing the P700m mezzanine card................................................................................132
Replacing capacity block parts...............................................................................................132
Replacing capacity block hard disk drive............................................................................
B Spare parts list ......................................................................................168
AW548A—Base Rack...........................................................................................................168
AW552A—X9700 Expansion Rack.........................................................................................168
AW549A—X9700 Server Chassis..........................................................................................169
AW550A—X9700 Blade Server .........
Danish recycling notice.....................................................................................................182
Dutch recycling notice.......................................................................................................183
Estonian recycling notice...................................................................................................183
Finnish recycling notice.....................................................................................................
1 Product description
The HP X9720 Network Storage System is a scalable, network-attached storage (NAS) product. The system combines HP X9000 File Serving Software with HP server and storage hardware to create a cluster of file serving nodes.
of multiple components, and a centralized management interface. X9000 Software can scale to thousands of nodes. Based on a Segmented File System architecture, X9000 Software integrates I/O and storage systems into a single clustered environment that can be shared across multiple applications and managed from a single central management console.
2 Getting started
This chapter describes how to log in to the system, how to boot the system and individual server blades, how to change passwords, and how to back up the management console configuration. It also describes the management interfaces provided with X9000 Software.
• Quotas. Configure user, group, and directory tree quotas as needed. • Remote replication. Use this feature to replicate changes in a source file system on one cluster to a target file system on either the same cluster or a second cluster. • Data replication and validation. Use this feature to manage WORM and retained files. • X9000 software snapshots. This feature is included in the X9000 software and can be used to take scheduled or on-demand software snapshots of a file system.
3. To power on the remaining server blades, run the command: ibrix_server -P on -h
NOTE: Alternatively, press the power button on all of the remaining servers. There is no need to wait for the first server blade to boot.
Management interfaces
Cluster operations are managed through the X9000 Software management console, which provides both a GUI and a CLI. Most operations can be performed from either the GUI or the CLI.
The GUI dashboard opens in the same browser window. You can open multiple GUI windows as necessary. See the online help for information about all GUI displays and operations. The GUI dashboard enables you to monitor the entire cluster. There are three parts to the dashboard: System Status, Cluster Overview, and the Navigator.
System Status
The System Status section lists the number of cluster events that have occurred in the last 24 hours. There are three types of events:
Alerts. Disruptive events that can result in loss of access to file system data. Examples are a segment that is unavailable or a server that cannot be accessed.
Warnings. Potentially disruptive conditions where file system access is not lost, but if the situation is not addressed, it can escalate to an alert condition.
Navigator
The Navigator appears on the left side of the window and displays the cluster hierarchy. You can use the Navigator to drill down in the cluster configuration to add, view, or change cluster objects such as file systems or storage, and to initiate or view tasks such as snapshots or replication. When you select an object, a details page shows a summary for that object. The lower Navigator allows you to view details for the selected object, or to initiate a task.
Adding user accounts for GUI access
X9000 Software supports administrative and user roles. When users log in under the administrative role, they can configure the cluster and initiate operations such as remote replication or snapshots. When users log in under the user role, they can view the cluster configuration and status, but cannot make configuration changes or initiate operations. The default administrative user name is ibrix. The default regular user name is ibrixuser.
Configuring ports for a firewall
IMPORTANT: To avoid unintended consequences, HP recommends that you configure the firewall during scheduled maintenance times.
When configuring a firewall, you should be aware of the following:
• SELinux should be disabled.
• By default, NFS uses random port numbers for operations such as mounting and locking. These ports must be fixed so that they can be listed as exceptions in a firewall configuration file. For example, you will need to lock specific ports for rpc.
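On Red Hat-based systems, the NFS ports can be fixed in /etc/sysconfig/nfs. The variable names below follow the standard Red Hat convention; the port numbers are illustrative choices, not values mandated by X9000:
# /etc/sysconfig/nfs -- example fixed ports (pick unused ports for your site)
LOCKD_TCPPORT=32803
LOCKD_UDPPORT=32769
MOUNTD_PORT=892
STATD_PORT=662
RQUOTAD_PORT=875
After fixing the ports, restart the NFS services and list the same ports as exceptions in the firewall configuration.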
Port                 Description
9000:9200/udp
20/tcp, 20/udp
21/tcp, 21/udp       Between file serving nodes and FTP clients (user network)
7777/tcp
8080/tcp             Between X9000 management console GUI and clients that need to access the GUI
5555/tcp, 5555/udp   Dataprotector
631/tcp, 631/udp     Internet Printing Protocol (IPP)
Configuring NTP servers
When the cluster is initially set up, primary and secondary NTP servers are configured to provide time synchronization with an external time source.
http://www.hp.com/go/insightremoteadvanced-docs
For IRSS documentation, see the following page:
http://www.hp.com/go/insightremotestandard-docs
Limitations
Note the following:
• For X9000 systems, the HP Insight Remote Support implementation is limited to hardware events.
• The MDS600 storage device on X9720 systems is not supported for HP Insight Remote Support.
• Some manual configurations require that the X9320 and X9300 nodes be recognized as an X9000 solution.
IMPORTANT: The /opt/hp/hp-snmp-agents/cma.conf file controls certain actions of the SNMP agents. You can add a trapIf entry to the file to configure the IP address used by the SNMP daemon when sending traps. For example, to send traps using the IP address of the eth1 interface, add the following: trapIf eth1 Then restart the HP SNMP agents: service hp-snmp-agents restart For more information about the cma.conf file, see Section 3.
NOTE: If you are using IRSS, see “Using the HP Insight Remote Support Configuration Wizard” and “Editing Managed Systems to Complete Configuration” in the HP Insight Remote Support Standard A.05.50 Hosting Device Configuration Guide. If you are using IRSA, see "Using the Remote Support Setting Tab to Update Your Client and CMS Information” and “Adding Individual Managed Systems” in the HP Insight Remote Support Advanced A.05.50 Operations Guide.
3 Configuring virtual interfaces for client access
X9000 Software uses a cluster network interface to carry management console traffic and traffic between file serving nodes. This network is configured as bond0 when the cluster is installed. For clusters with an agile management console configuration, a virtual interface is also created for the cluster network interface to provide failover support for the console.
# ibrix_nic -b -H node1/bond1:1,node2/bond1:2
# ibrix_nic -b -H node2/bond1:1,node1/bond1:2
# ibrix_nic -b -H node3/bond1:1,node4/bond1:2
# ibrix_nic -b -H node4/bond1:1,node3/bond1:2
Configuring NIC failover
NIC monitoring should be configured on VIFs that will be used by NFS, CIFS, FTP, or HTTP. Use the same backup pairs that you used when configuring standby servers.
HTTP. When you create a virtual host on the Create Vhost dialog box or with the ibrix_httpvhost command, specify the VIF as the IP address that clients should use to access shares associated with the Vhost. X9000 clients. Use the following command to prefer the appropriate user network. Execute the command once for each destination host that the client should contact using the specified interface. ibrix_client -n -h SRCHOST -A DESTHOST/IFNAME For example: ibrix_client -n -h client12.mycompany.
4 Configuring failover
This chapter describes how to configure failover for agile management consoles, file serving nodes, network interfaces, and HBAs.
Agile management consoles
The management console maintains the cluster configuration and provides graphical and command-line user interfaces for managing and monitoring the cluster. The agile management console is installed on all file serving nodes when the cluster is installed.
Failing over the management console manually To fail over the active management console manually, place the console into maintenance mode. Enter the following command on the node hosting the console: ibrix_fm -m maintenance The command takes effect immediately. The failed-over management console remains in maintenance mode until it is moved to passive mode using the following command: ibrix_fm -m passive A management console cannot be moved from maintenance mode to active mode.
1. The management console verifies that the standby is powered on and accessible.
2. The management console migrates ownership of the node’s segments to the standby and notifies all file serving nodes and X9000 clients about the migration. This is a persistent change.
3. If network interface monitoring has been set up, the management console activates the standby user network interface and transfers the IP address of the node’s user network interface to it.
Preliminary configuration
The following configuration steps are required when setting up integrated power sources:
• If you plan to implement automated failover, ensure that the management console has LAN access to the power sources.
• Install the environment and any drivers and utilities, as specified by the vendor documentation. If you plan to protect access to the power sources, set up the UID and password to be used.
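As a hedged illustration of registering a power source (the exact ibrix_powersrc syntax should be verified in the HP X9000 File Serving Software CLI Reference Guide; the name, IP address, and credentials here are hypothetical):
ibrix_powersrc -a -t ipmi -h ps1 -I 192.168.3.170 -u Admin -p password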
Manual failover does not require the use of programmable power supplies. However, if you have installed and identified power supplies for file serving nodes, you can power down a server before manually failing it over. You can fail over a file serving node manually, even when automated failover is turned on. A file serving node can be failed over from the GUI or the CLI. On the CLI, complete the following steps: 1. Run ibrix_server -f, specifying the node to be failed over in the HOSTNAME option.
problems if the cluster interface fails.) There is no difference in the way that monitoring is set up for the cluster interface and a user network interface. In both cases, you set up file serving nodes to monitor each other over the interface. Sample scenario The following diagram illustrates a monitoring and failover scenario in which a 1:1 standby relationship is configured. Each standby pair is also a network interface monitoring pair.
Setting up a monitor
File serving node failover pairs can be identified as network interface monitors for each other. Because the monitoring must be declared in both directions, this is a two-pass process for each failover pair. To set up a network interface monitor, use the following command:
/bin/ibrix_nic -m -h MONHOST -A DESTHOST/IFNAME
For example, to set up file serving node s2.hp.com to monitor file serving node s1.hp.
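A completed command following the MONHOST and DESTHOST/IFNAME pattern above would look like this (the interface name eth2 is a hypothetical example):
/bin/ibrix_nic -m -h s2.hp.com -A s1.hp.com/eth2
Because monitoring must be declared in both directions, run the mirror-image command with the two hosts swapped.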
Discovering HBAs
You must discover HBAs before you set up HBA monitoring, when you replace an HBA, and when you add a new HBA to the cluster. Discovery informs the configuration database of only a port’s WWPN. You must identify ports that are teamed as standby pairs.
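A hedged sketch of the discovery workflow (confirm the exact flags in the HP X9000 File Serving Software CLI Reference Guide; the host name is hypothetical): discover the HBAs on a node, then list them to obtain the WWPNs needed for standby pairing:
ibrix_hba -a -h s1.hp.com
ibrix_hba -l -h s1.hp.com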
The following table describes the fields in the output.
Field              Description
Host               Server on which the HBA is installed.
Node WWN           This HBA’s WWNN.
Port WWN           This HBA’s WWPN.
Port State         Operational state of the port.
Backup Port WWN    WWPN of the standby port for this port (standby-paired HBAs only).
Monitoring         Whether HBA monitoring is enabled for this port.
Viewing a detailed report Execute the ibrix_haconfig -i command to view the detailed report: /bin/ibrix_haconfig -i [-h HOSTLIST] [-f] [-b] [-s] [-v] The -h HOSTLIST option lists the nodes to check. To also check standbys, include the -b option. To view results only for file serving nodes that failed a check, include the -f argument. The -s option expands the report to include information about the file system and its segments.
5 Configuring cluster event notification
Cluster events
You can be notified of cluster events by email or SNMP traps. To view the list of supported events, use the command ibrix_event -q. There are three major categories for events, depending on their severity.
Alerts. Disruptive events that can result in loss of access to file system data.
Warnings. Potentially disruptive conditions where file system access is not lost, but if the situation is not addressed, it can escalate to an alert condition.
The notification threshold for Alert events is 90% of capacity. Threshold-triggered notifications are sent when a monitored system resource exceeds the threshold and are reset when the resource utilization dips 10% below the threshold. For example, a notification is sent the first time usage reaches 90% or more. The next notice is sent only if the usage declines to 80% or less (event is reset), and subsequently rises again to 90% or above.
Viewing email notification settings
The ibrix_event command provides comprehensive information about email settings and configured notifications.
/bin/ibrix_event -L
Sample output follows:
Email Notification : Enabled
SMTP Server        : mail.hp.com
From               : FM@hp.com
Reply To           : MIS@hp.com

EVENT                                LEVEL  TYPE   DESTINATION
-----------------------------------  -----  -----  ------------
asyncrep.completed                   ALERT  EMAIL  admin@hp.com
asyncrep.failed                      ALERT  EMAIL  admin@hp.com
Some SNMP parameters and the SNMP default port are the same, regardless of SNMP version. The agent port is 5061 by default. SYSCONTACT, SYSNAME, and SYSLOCATION are optional MIB-II agent parameters that have no default values. The -c and -s options are also common to all SNMP versions. The -c option turns the encryption of community names and passwords on or off. There is no encryption by default.
Associating events and trapsinks
Associating events with trapsinks is similar to associating events with email recipients, except that you specify the host name or IP address of the trapsink instead of an email address. Use the ibrix_event command to associate SNMP events with trapsinks. The format is:
/bin/ibrix_event -c -y SNMP [-e ALERT|INFO|EVENTLIST] -m TRAPSINK
For example, to associate all Alert events and two Info events with a trapsink at IP address 192.168.2.
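A completed example of the same form (the trapsink address and the two Info event names are illustrative):
/bin/ibrix_event -c -y SNMP -e ALERT,server.registered,filesystem.created -m 192.168.2.32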
For example, to create the group group2 to require authorization, no encryption, and read access to the hp view, enter: ibrix_snmpgroup -c -g group2 -s authNoPriv -r hp The format to create a user and add that user to a group follows: ibrix_snmpuser -c -n USERNAME -g GROUPNAME [-j {MD5|SHA}] [-k AUTHORIZATION_PASSWORD] [-y {DES|AES}] [-z PRIVACY_PASSWORD] Authentication and privacy settings are optional.
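For example, to create a hypothetical user user1 in group2 with MD5 authentication (the password is a placeholder):
ibrix_snmpuser -c -n user1 -g group2 -j MD5 -k mypassword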
6 Configuring system backups
Backing up the management console configuration
The management console configuration is automatically backed up whenever the cluster configuration changes. The backup takes place on the node hosting the active management console. The backup file is stored at /tmp/fmbackup.zip on the machine where it was created. In an agile configuration, the active management console notifies the passive management console when a new backup file is available.
Configuring NDMP parameters on the cluster
Certain NDMP parameters must be configured to enable communications between the DMA and the NDMP Servers in the cluster. To configure the parameters on the management console GUI, select Cluster Configuration from the Navigator, and then select NDMP Backup. The NDMP Configuration Summary shows the default values for the parameters. Click Modify to configure the parameters for your cluster on the Configure NDMP dialog box.
To cancel a session, select that session and click Cancel Session. Canceling a session kills all spawned session processes and frees their resources if necessary. To see similar information for completed sessions, select NDMP Backup > Session History. To view active sessions from the CLI, use the following command:
ibrix_ndmpsession -l
To view completed sessions, use the following command. The -t option restricts the history to sessions occurring on or before the specified date.
NDMP events
An NDMP Server can generate three types of events: INFO, WARN, and ALERT. These events are displayed on the management console GUI and can be viewed with the ibrix_event command.
INFO events. These events specify when major NDMP operations start and finish, and also report progress. For example:
7012:Level 3 backup of /mnt/ibfs7 finished at Sat Nov 7 21:20:58 PST 2009
7013:Total Bytes = 38274665923, Average throughput = 236600391 bytes/sec.
WARN events.
7 Creating hostgroups for X9000 clients
A hostgroup is a named set of X9000 clients. Hostgroups provide a convenient way to centrally manage clients using the management console. You can put different sets of clients into hostgroups and then perform the following operations on all members of the group:
• Create and delete mountpoints
• Mount file systems
• Prefer a network interface
• Tune host parameters
• Set allocation policies
Hostgroups are optional.
To set up one level of hostgroups beneath the root, simply create the new hostgroups. You do not need to declare that the root node is the parent. To set up lower levels of hostgroups, declare a parent element for hostgroups. Optionally, you can specify a domain rule for a hostgroup. Use only alphanumeric characters and the underscore character (_) in hostgroup names. Do not use a host name as a group name. To create a hostgroup tree using the CLI: 1.
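As a hedged sketch of the sequence (the group names are hypothetical, and the exact flags should be confirmed in the HP X9000 File Serving Software CLI Reference Guide), create the first-level group and then a child group that declares its parent:
ibrix_hostgroup -c -g finance
ibrix_hostgroup -c -g payroll -p finance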
Deleting hostgroups When you delete a hostgroup, its members are assigned to the parent of the deleted group.
8 Monitoring cluster operations
Monitoring the X9720 Network Storage System status
The X9720 storage monitoring function gathers X9720 system status information and generates a monitoring report. The X9000 management console displays status information on the dashboard. This section describes how to use the CLI to view this information.
Monitoring intervals
The monitoring interval is set by default to 15 minutes (900 seconds).
State            Description
Down-FailedOver  Server is powered down or inaccessible to the management console, and failover is complete.
Down             Server is powered down or inaccessible to the management console, and no standby server is providing access to the server’s segments.
The STATE field also reports the status of monitored NICs and HBAs. If you have multiple HBAs and NICs and some of them are down, the state will be reported as HBAsDown or NicsDown.
Monitoring cluster health
To monitor the functional health of file serving nodes and X9000 clients, execute the ibrix_health command. This command checks host performance in several functional areas and provides either a summary or a detailed report of the results.
Health checks
The ibrix_health command runs these health checks on file serving nodes:
• Pings remote file serving nodes that share a network with the test hosts.
/bin/ibrix_health -l -h i080,lab13-116
Sample output follows:
PASSED
--------------- Host Summary Results ---------------
Host       Result  Type    State  Last Update
=========  ======  ======  =====  ============================
i080       PASSED  Server  Up     Mon Apr 09 16:45:03 EDT 2007
lab13-116  PASSED  Client  Up     Mon Apr 09 16:07:22 EDT 2007
Viewing a detailed health report
To view a detailed health report, use the ibrix_health -i command:
/bin/ibrix_health -i -h HOSTLIST [-f] [-s] [-v] The
Physical volume 7DRzC8-ucwo-p3D2-c89r-nwZD-E1ju-61VMw9 readable PASSED /dev/sda
Physical volume YipmIK-9WFE-tDpV-srtY-PoN7-9m23-r3Z9Gm readable PASSED /dev/sdb
Physical volume ansHXO-0zAL-K058-eEnZ-36ov-Pku2-Bz4WKs readable PASSED /dev/sdi
Physical volume oGt3qi-ybeC-E42f-vLg0-1GIF-My3H-3QhN0n readable PASSED /dev/sdj
Physical volume wzXSW3-2pxY-1ayt-2lkG-4yIH-fMez-QHfbgg readable PASSED /dev/sdd
Check : Iad and Fusion Manager consistent
=========================================
Check Description Result
• CPU. Statistics about processor and CPU activity. • NFS. Statistics about NFS client and server activity. The management console GUI displays most of these statistics on the dashboard. See “Using the GUI” (page 15) for more information.
9 Using the Statistics tool
The Statistics tool reports historical performance data for the cluster or for an individual file serving node. You can view data for the network, the operating system, file systems, memory, and block devices. Statistics data is transmitted from each file serving node to the management console, which controls processing and report generation.
Installing and configuring the Statistics tool
The Statistics tool has two parts:
• Manager process.
NOTE: Do not run the command on individual nodes. All nodes must be specified in the same command and can be specified in any order. Be sure to use node names, not IP addresses. To test the rsync mechanism, see “Testing rsync access” (page 65).
4. On the active node, edit the /etc/ibrix/stats.conf file to add the age.retain.files=24h parameter. See “Changing the Statistics tool configuration” (page 62).
5. Create a symbolic link from /var/lib/ibrix/histstats to the /local/statstool/histstats directory.
The Time View lists the reports in chronological order, and the Table View lists the reports by cluster or server. Click a report to view it. Generating reports To generate a new report, click Request New Report on the X9000 Management Console Historical Reports GUI.
Enter the specifications for your report and click Submit. The management console will then generate the report. The completed report will appear in the list of reports on the statistics home page. When generating reports, you should be aware of the following: • A report can be generated only from statistics that have been gathered. For example, if you start the tool at 9:40am and ask for a report from 9:00am to 9:30am, the report cannot be generated because data was not gathered for that period.
You should also configure how long to retain staging data and data stored in the statistics database. The retention periods are controlled by parameters in the /etc/ibrix/stats.conf file. • Aging configuration for staging data: The age.retain.files parameter specifies the number of hours, starting from the current time, to retain collected data. If the parameter is not set, data is retained for 30 days by default.
Changing the Statistics tool configuration The configuration can be changed only on the management node. To change the configuration, add a configuration parameter and its value to the /etc/ibrix/stats.conf file on the currently active node. The supported configuration changes are: • Interval for data collection. The default value is 15 seconds. To change the interval to 30 seconds, add the following line to the stats.conf file: collector.time=30 • Aging configuration for staging statistics data.
Management console failover and the Statistics tool configuration
In an X9000 Software High Availability scenario, migrate the Stats active management console and the collected data (including reports) to the current active management console. On the current active node:
1. Stop statstool:
# /etc/init.d/ibrix_statsagent stop --agilefm_passive
2.
10. Run the passive migrator script:
# /usr/local/ibrix/stats/bin/stats_passive_migrator
11. Start statstool:
# /etc/init.d/ibrix_statsagent start --agilefm_passive
NOTE: Passwordless authentication is required for the migration to succeed. To determine whether passwordless authentication is enabled, see “Configuring shared ssh keys” (page 66) and “Enabling collection and synchronization” (page 57). Migrate reports from the old active node to the current active node.
# /etc/init.d/ibrix_statsmanager status
ibrix_statsmanager (pid 25322) is running...
In the output, the pid is the process id of the “master” process.
Controlling Statistics tool processes
Statistics tool processes on all file serving nodes connected to the active management console can be controlled remotely from the active management console. Use the ibrix_statscontrol tool to start or stop the processes on all connected file serving nodes or on specified hostnames only.
Log files
See /var/log/stats.log for detailed logging for the Statistics tool. (The information includes detailed exceptions and traceback messages). The logs are rolled over at midnight every day and only seven days of compressed statistics logs are retained. The default /var/log/messages log file also includes logging for the Statistics tool, but the messages are short.
Configuring shared ssh keys
To configure one-way shared ssh keys on the cluster, complete the following steps:
1.
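The steps follow the standard one-way ssh key distribution pattern; a minimal sketch, assuming root access and a hypothetical node name:
# on the active management console node
ssh-keygen -t rsa
# repeat for each file serving node in the cluster
ssh-copy-id root@node2
This lets the active node run rsync and remote commands on the other nodes without a password prompt.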
10 Maintaining the system
Shutting down the system
To shut down the system completely, first shut down the X9000 software, and then power off the X9720 hardware.
Shutting down the X9000 Software
Use the following procedure to shut down the X9000 Software. Unless noted otherwise, run the commands from the dedicated Management Console or from the node hosting the active agile management console.
1. Disable HA for all file serving nodes:
ibrix_server -m -U
2.
1. Power on the 9100cx disk capacity block(s).
2. Power on the 9100c controllers.
3. Wait for all controllers to report “on” in the 7-segment display.
4. Power on the file serving nodes.
Powering on after a power failure
If a power failure occurred, all of the hardware will power on at once when the power is restored. The file serving nodes will boot before the storage is available, preventing file systems from mounting.
Use one of the following schemes for the reboot:
• Reboot the file serving nodes one at a time.
• Divide the file serving nodes into two groups, with the nodes in the first group having backups in the second group, and the nodes in the second group having backups in the first group. You can then reboot one group at a time.
To perform the rolling reboot, complete the following steps on each file serving node:
1. Reboot the node directly from Linux.
Use the ibrix_host_tune command to list or change host tuning settings: • To list default values and valid ranges for all permitted host tunings: /bin/ibrix_host_tune -L • To tune host parameters on nodes or hostgroups: /bin/ibrix_host_tune -S {-h HOSTLIST|-g GROUPLIST} -o OPTIONLIST Contact HP Support to obtain the values for OPTIONLIST. List the options as option=value pairs, separated by commas. To set host tunings on all clients, include the -g clients option.
on the physical segment itself, and the ownership data is part of the metadata that the management console distributes to file serving nodes and X9000 clients so that they can locate segments.
Migrating specific segments
Use the following command to migrate ownership of the segments in LVLIST on file system FSNAME to a new host and update the source host:
/bin/ibrix_fs -m -f FSNAME -s LVLIST -h HOSTNAME [-M] [-F] [-N]
To force the migration, include -M.
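For example, to migrate segments ilv2 and ilv3 of file system ifs1 to node s2.hp.com (all names are hypothetical):
/bin/ibrix_fs -m -f ifs1 -s ilv2,ilv3 -h s2.hp.com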
1. Identify the segment residing on the physical volume to be removed. Select Storage from the Navigator on the management console GUI. Note the file system and segment number on the affected physical volume.
2. Locate other segments on the file system that can accommodate the data being evacuated from the affected segment. Select the file system on the management console GUI and then select Segments from the lower Navigator.
Maintaining networks Cluster and user network interfaces X9000 Software supports the following logical network interfaces: • Cluster network interface. This network interface carries management console traffic, traffic between file serving nodes, and traffic between file serving nodes and clients. A cluster can have only one cluster interface. For backup purposes, each file serving node and management console can have two cluster NICs. • User network interface.
Identifying a user network interface for a file serving node
To identify a user network interface for specific file serving nodes, use the ibrix_nic command. The interface name (IFNAME) can include only alphanumeric characters and underscores, such as eth1.
/bin/ibrix_nic -a -n IFNAME -h HOSTLIST
If you are identifying a VIF, add the VIF suffix (:nnnn) to the physical interface name.
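For example, to identify the physical interface eth1, and then a VIF on it, for two file serving nodes (host names are hypothetical):
/bin/ibrix_nic -a -n eth1 -h s1.hp.com,s2.hp.com
/bin/ibrix_nic -a -n eth1:1 -h s1.hp.com,s2.hp.com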
clients when you prefer a network interface, you can force clients to query the management console by executing the command ibrix_lwhost --a on the client or by rebooting the client.
Preferring a network interface for a file serving node or Linux X9000 client
The first command prefers a network interface for a file serving node; the second command prefers a network interface for a client.
Changing the IP address for the cluster interface on a dedicated management console
You must change the IP address for the cluster interface on both the file serving nodes and the management console.
1. If High Availability is enabled, disable it by executing ibrix_server -m -U.
2. Unmount the file system from all file serving nodes, and reboot.
3. On each file serving node, locally change the IP address of the cluster interface.
4.
Deleting a network interface Before deleting the interface used as the cluster interface on a file serving node, you must assign a new interface as the cluster interface. See “Changing the cluster interface” (page 76). To delete a network interface, use the following command: /bin/ibrix_nic -d -n IFNAME -h HOSTLIST The following command deletes interface eth3 from file serving nodes s1.hp.com and s2.hp.com: /bin/ibrix_nic -d -n eth3 -h s1.hp.com,s2.hp.
11 Migrating to an agile management console configuration
The agile management console configuration provides one active management console and one passive management console installed on different file serving nodes in the cluster. The migration procedure configures the current Management Server blade as a host for an agile management console and installs another instance of the agile management console on a file serving node.
In the command, the IP address specified with -c is the old cluster IP address for the original management console and the IP address specified with -I is the new IP address you acquired. For example:
[root@x109s1 ~]# ibrix_fm -c 172.16.3.1 -d bond0:1 -n 255.255.248.0 -v cluster -I 172.16.3.100
Command succeeded!
The original cluster IP address is now configured to the newly created cluster VIF device (bond0:1).
5.
11. Verify that there is only one management console in this cluster:
ibrix_fm -f
For example:
[root@x109s1 ~]# ibrix_fm -f
NAME    IP ADDRESS
------  ----------
X109s1  172.16.3.100
Command succeeded!
12. Install a passive agile management console on a second file serving node. In the command, the -F option forces the overwrite of the new_lvm2_uuid file that was installed with the X9000 Software.
1. On the node hosting the active management console, place the management console into maintenance mode. This step fails over the active management console role to the node currently hosting the passive agile management console.
/bin/ibrix_fm -m maintenance
2.
12 Upgrading the X9000 Software
This chapter describes how to upgrade to the latest X9000 File Serving Software release. The management console and all file serving nodes must be upgraded to the new release at the same time. Note the following:
• Upgrades to the X9000 Software 6.0 release are supported for systems currently running X9000 Software 5.5.x and 5.6.x. If your system is running an earlier release, first upgrade to the latest 5.5 release, and then upgrade to 6.0.
• The upgrade to 6.
7. On all nodes hosting the passive management console, place the management console into maintenance mode:
/bin/ibrix_fm -m maintenance
8. On the active management console node, disable automated failover on all file serving nodes:
/bin/ibrix_server -m -U
9. Run the following command to verify that automated failover is off. In the output, the HA column should display off.
/bin/ibrix_server -l
10.
nodes. The management console is in active mode on the node where the upgrade was run, and is in passive mode on the other file serving nodes. If the cluster includes a dedicated Management Server, the management console is installed in passive mode on that server.
5. Upgrade Linux X9000 clients. See “Upgrading Linux X9000 clients” (page 91).
6. If you received a new license from HP, install it as described in the “Licensing” chapter in this guide.
After the upgrade
Complete the following steps:
1.
---------  -------------
ib121-121  10.10.121.121
ib121-122  10.10.121.122
If there is a mismatch on your system, you will see errors when connecting to ports 1234 and 9009. To correct this condition, see “Moving the management console VIF to bond1” (page 94).
Offline upgrades for X9000 Software 5.5.x to 6.0
The upgrade from X9000 Software 5.5.x to 6.0 is supported only as an offline upgrade. Because it requires an upgrade of the kernel, the local disk must be reformatted.
1. Ensure that all nodes are up and running. To determine the status of the cluster nodes, check the dashboard on the GUI or use the ibrix_health command.
2. Obtain the latest HP X9000 Quick Restore DVD version 6.0 ISO image from the HP kiosk at http://www.software.hp.com/kiosk (you will need your HP-provided login credentials).
3. Unmount file systems on Linux X9000 clients.
4. Copy the .iso file onto the server hosting the current active management console.
5.
After the upgrade
Complete the following steps:
1. Run the following command to rediscover physical volumes:
ibrix_pv -a
2. Apply any custom tuning parameters, such as mount options.
3. Remount all file systems:
ibrix_mount -f -m
4. Re-enable High Availability if used:
ibrix_server -m
5. Start any Remote Replication, Rebalancer, or data tiering tasks that were stopped before the upgrade.
6.
Manual upgrade procedure The manual upgrade process requires external storage that will be used to save the cluster configuration. Each server must be able to access this media directly, not through a network, as the network configuration is part of the saved configuration. HP recommends that you use a USB stick or DVD. NOTE: Be sure to read all instructions before starting the upgrade procedure. To determine which node is hosting the agile management console configuration, run the ibrix_fm -i command.
Use kill -9 to stop any Likewise services that are still running. If you are using NFS, verify that all NFS processes are stopped:
ps -ef | grep nfs
If necessary, use the following command to stop NFS services:
/etc/init.d/nfs stop
Use kill -9 to stop any NFS processes that are still running.
5. When the following screen appears, enter qr to install the X9000 software on the file serving node. The server reboots automatically after the software is installed. Remove the DVD from the DVD-ROM drive. Restoring the node configuration Complete the following steps on each node, starting with the previous active management console node: 1. Log in to the node. The configuration wizard should pop up. Escape out of the configuration wizard. 2.
script also ensures that the management console is available on all file serving nodes and installs the management console in passive mode on any dedicated Management Servers. 4. 5. 6. Upgrade Linux X9000 clients. See “Upgrading Linux X9000 clients” (page 91). If you received a new license from HP, install it as described in the “Licensing” chapter in this document. Run the following command to rediscover physical volumes: ibrix_pv -a 7. 8.
1. Download the latest HP X9000 Client 6.0 package.
2. Expand the tar file.
3. Run the upgrade script:
./ibrixupgrade -f
The upgrade software automatically stops the necessary services and restarts them when the upgrade is complete.
4. Execute the following command to verify the client is running X9000 software:
/etc/init.d/ibrix_client status
IBRIX Filesystem Drivers loaded
IBRIX IAD Server (pid 3208) running...
The IAD service should be running, as shown in the previous sample output.
/usr/local/ibrix/autocfg/bin/ibrixapp upgrade -f -s
• If the install of the new image succeeds, but the configuration restore fails and you need to revert the server to the previous install, run the following command and then reboot the machine. This step causes the server to boot from the old version (the alternate partition).
/usr/local/ibrix/setup/upgrade/boot_info -r
• If the public network interface is down and inaccessible for any node, power cycle that node.
[root@ib51-102 ~]# ibrix_fm -f
NAME      IP ADDRESS
--------  ------------
ib51-101  10.10.51.101
ib51-102  10.10.51.102
[root@ib51-102 ~]# ibrix_fm -i
FusionServer: ib51-102 (active, quorum is running)
==================================================
File system unmount issues
If a file system does not unmount successfully, perform the following steps on all servers:
1. Run the following commands:
chkconfig ibrix_server off
chkconfig ibrix_ndmp off
chkconfig ibrix_fusionmanager off
2. Reboot all servers.
3.
4. On the active agile management console, re-register all backup management consoles using the bond1 Local Cluster IP address for each node:
# ibrix_fm -R -I
NOTE: When registering a management console, be sure the hostname specified with -R matches the hostname of the server.
5. Return the backup management consoles to passive mode:
# ibrix_fm -m passive
6. Place the active management console into maintenance mode to force it to fail over.
13 Licensing
This chapter describes how to view your current license terms and how to obtain and install new X9000 Software product license keys.
Viewing license terms
The X9000 Software license file is stored in the installation directory on the management console. To view the license from the management console GUI, select Cluster Configuration in the Navigator and then select License.
14 Upgrading the X9720 Network Storage System hardware and firmware
WARNING! Before performing any of the procedures in this chapter, read the important warnings, precautions, and safety information in “Warnings and precautions” (page 172) and “Regulatory compliance notices” (page 176).
Upgrading firmware
IMPORTANT: The X9720 system is shipped with the correct firmware and drivers.
4. Install the software on the server blade. The Quick Restore DVD is used for this purpose. See “Recovering the X9720 Network Storage System” (page 138) for more information.
5. Set up failover. For more information, see the HP X9000 File Serving Software User Guide.
6. Enable high availability (automated failover) by running the following command on server 1:
# ibrix_server -m
7. Discover storage on the server blade:
ibrix_pv -a
8.
Carton contents
• HP X9700c, containing 12 disk drives
• HP X9700cx (also known as HP 600 Modular Disk System [MDS600]), containing 70 disk drives
• Rack mounting hardware
• Two-meter cables (quantity—4)
• Four-meter cables (quantity—2)
Where to install the capacity blocks
Base cabinet additional capacity blocks
1 X9700c 4
2 X9700c 3
3 X9700c 2
4 X9700c 1
5 X9700cx 4
6 X9700cx 3
7 TFT monitor and keyboard
8 c-Class Blade Enclosure
9 X9700cx 2
10 X9700cx 1
Expansion cabinet additional ca
System, the X9700c 5 component goes in slots U31 through 32 (see callout 4), and the X9700cx 5 goes in slots U1 through U5 (see callout 8).
1 X9700c 8
2 X9700c 7
3 X9700c 6
4 X9700c 5
5 X9700cx 8
6 X9700cx 7
7 X9700cx 6
8 X9700cx 5
Installation procedure
Add the capacity blocks one at a time, until the system contains the maximum it can hold. The factory pre-provisions the additional capacity blocks with the standard LUN layout and capacity block settings (for example, rebuild priority).
1. Secure the front end of the rails to the cabinet in the correct location.
NOTE: Identify the left (L) and right (R) rack rails by markings stamped into the sheet metal.
2. Secure the back end of the rails to the cabinet.
3. Insert the X9700c into the cabinet.
4. Use the thumbscrews on the front of the chassis to secure it to the cabinet.
Step 2—Install X9700cx in the cabinet
WARNING! Do not remove the disk drives before inserting the X9700cx into the cabinet.
1 X9700c
2 X9700cx primary I/O module (drawer 2)
3 X9700cx secondary I/O module (drawer 2)
4 X9700cx primary I/O module (drawer 1)
5 X9700cx secondary I/O module (drawer 1)
Step 4—Cable the X9700c to SAS switches
Using the two 4-meter cables, cable the X9700c to the SAS switch ports in the c-Class Blade Enclosure, as shown in the following illustrations for cabling the base or expansion cabinet.
Base cabinet
Callouts 1 through 3 indicate additional X9700c components.
4 X9700c 1
5 SAS switch ports 1 through 4 (in interconnect bay 3 of the c-Class Blade Enclosure). Ports 2 through 4 are used by additional capacity blocks.
6 Reserved for expansion cabinet use.
7 SAS switch ports 1 through 4 (in interconnect bay 4 of the c-Class Blade Enclosure). Ports 2 through 4 are used by additional capacity blocks.
8 Reserved for expansion cabinet use.
Expansion cabinet
1 X9700c 8
2 X9700c 7
3 X9700c 6
4 X9700c 5
5 Used by base cabinet.
The X9720 Network Storage System cabinet comes with the power cords tied to the cabinet. Connect the power cords to the X9700cx first, and then connect the power cords to the X9700c. IMPORTANT: If your X9720 Network Storage System cabinet contains more than two capacity blocks, you must connect all the PDUs to a power source. Step 6—Power on the X9700c and X9700cx components Power on the X9700cx first, then power on the X9700c. Step 7—Discover the capacity block and validate firmware versions 1. 2. 3.
1. Identify the name of the registered vendor storage:
ibrix_vs -l
2. Un-register the existing vendor storage:
ibrix_vs -d -n STORAGENAME
3. Register the vendor storage. In the command, the IP, USERNAME, and PASSWORD are for the OA.
ibrix_vs -r -n STORAGENAME -t exds -I IP(s) -U USERNAME -P PASSWORD
For more information about ibrix_vs, see the HP X9000 File Serving Software CLI Reference Guide.
2. Delete the volume groups, logical volumes, and physical volumes associated with the LUN.
3. Disconnect the SAS cables connecting both array controllers to the SAS switches.
CAUTION: Ensure that you remove the correct capacity block. Removing the wrong capacity block could result in data that is inaccessible.
15 Troubleshooting
Collecting information for HP Support with Ibrix Collect
Ibrix Collect is an enhancement to the log collection utility that already exists in X9000 (Support Ticket). If system issues occur, you can collect relevant information for diagnosis by HP Support. The collection can be triggered manually using the GUI or CLI, or automatically during a system crash.
4. Click Okay.
To collect logs and command results using the CLI, use the following command:
ibrix_collect -c -n NAME
NOTE: Only one manual collection of data is allowed at a time.
NOTE: When a node restores from a system crash, the vmcore under the /var/crash/ directory is processed. Once processed, the directory will be renamed /var/crash/_PROCESSED. HP Support may request that you send this information to assist in resolving the system crash.
NOTE: Only one collection can be downloaded at a time. NOTE: The average size of the archive file depends on the size of the logs present on individual nodes in the cluster. NOTE: You may later be asked to email this final zip file to HP Support. Be aware that the final zip file is not the same as the zip file that you receive in your email. Configuring Ibrix Collect You can configure data collection to occur automatically upon a system crash.
c. Under Email Settings, enable or disable sending cluster configuration by email by checking or unchecking the appropriate box.
d. Fill in the remaining required fields for the cluster configuration and click Okay.
To set up email settings to send cluster configurations using the CLI, use the following command:
ibrix_collect -C -m [-s ] [-f ] [-t ]
NOTE: More than one email ID can be specified for the -t option, separated by a semicolon.
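A hypothetical completed example (the argument forms are assumptions, since the placeholders are not shown above; the server and addresses are also placeholders): enable emailing of the cluster configuration through SMTP server mail.hp.com from admin@hp.com to two recipients:
ibrix_collect -C -m Yes -s mail.hp.com -f admin@hp.com -t "support1@hp.com;support2@hp.com"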
The escalate tool needs the root password to perform some actions. Be prepared to enter the root password when prompted. There are a few useful options; however, you can usually run without options. The -h option displays the available options. It is normal for the escalate command to take a long time (over 20 minutes). When the escalate tool finishes, it generates a report and stores it in a file such as /exds_glory1_escalate.tgz.gz. Copy this file to another system and send it to HP Services.
Accessing the Onboard Administrator (OA) via service port Each OA has a service port (this is the right-most Ethernet port on the OA). This allows you to use a laptop to access the OA command line interface. See HP BladeSystem c7000 Enclosure Setup and Installation Guide for instructions on how to connect a laptop to the service port. Using hpacucli – Array Configuration Utility (ACU) The hpacucli command is a command line interface to the X9700c controllers.
hba    PAPWV0F9SXA00S P700m in 7930RFCC fw 5.74 boxes 0 disks 0 luns 0 batteries 0/cache
switch HP.3G.SAS.BL.SWH in 4A fw 2.72
switch HP.3G.SAS.BL.SWH in 3A fw 2.72
switch HP.3G.SAS.BL.SWH in 4B fw 2.72
switch HP.3G.SAS.BL.SWH in 3B fw 2.72
ctlr   P89A40A9SV600X ExDS9100cc in 01/USP7030EKR slot 1 fw 0126.2008120502 boxes 3 disks 80 luns 10 batteries 2/OK cache OK
box    1 ExDS9100c sn USP7030EKR fw 1.56 temp OK fans OK,OK,OK,OK power OK,OK
box    2 ExDS9100cx sn CN881502JE fw 1.
Sample output
exds_netperf
The exds_netperf tool measures network performance. The tool measures performance between a client system and the X9720 Network Storage System. Run this test when the system is first installed. Where networks are working correctly, the performance results should match the expected link rate of the network, that is, for a 1GbE link, expect about 90 MB/s. You can also run the test at other times to determine if degradation has occurred.
• On the client host, run exds_netperf in serial mode against each X9720 Network Storage System server in turn. For example, if there are two servers whose eth2 addresses are 16.123.123.1 and 16.123.123.2, use the following command: # exds_netperf --serial --server “16.123.123.1 16.123.123.2” • On a client host, run exds_netperf in parallel mode, as shown in the following example.
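A hedged sketch of the parallel run, assuming the tool accepts a --parallel option symmetric with the --serial option shown above:
# exds_netperf --parallel --server "16.123.123.1 16.123.123.2"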
Identifying failed I/O modules on an X9700cx chassis When an X9700cx I/O module (or the SAS cable connected to it) fails, the X9700c controller attached to the I/O module reboots and if the I/O module does not immediately recover, the X9700c controller stays halted. Because there are two X9700cx I/O modules, it is not immediately obvious which I/O module has failed. In addition, the X9700c controller may halt or appear to fail for other reasons.
1. Verify that SAS cables are connected to the correct controller and I/O module. The following diagram shows the correct wiring of the SAS cables.
1. X9700c
2. X9700cx primary I/O module (drawer 2)
3. X9700cx secondary I/O module (drawer 2)
4. X9700cx primary I/O module (drawer 1)
5.
3. Check the SAS cables connecting the halted X9700c controller and the X9700cx I/O modules. Disconnect and re-insert the SAS cables at both ends. In particular, ensure that the SAS cable is fully inserted into the I/O module and that the bottom port on the X9700cx I/O module is being used. If there are obvious signs of damage to a cable, replace the SAS cable. 4. Re-seat the halted X9700c controller: a. Push the controller fully into the chassis until it engages. b.
c. Wait for the controller to boot.
• If the seven-segment display shows “on,” then the fault has been corrected and the system has returned to normal and you can proceed to step 11.
• If the seven-segment continues to show an Hn 67 or Cn 02 code, continue to the next step.
d. If the fault does not clear, remove the left I/O module and reinsert the original I/O module.
12. Run the exds_stdiag command to verify the firmware version. Check that the firmware is the same on both drawers (boxes) of the X9700cx. Following is an example of exds_stdiag output:
...
ctlr P89A40C9SW705J ExDS9100cc in 01/SGA830000M slot 1 fw 0126.2008120502 boxes 3 disks 22 luns 5
box 1 ExDS9100c sn SGA830000M fw 1.56 fans OK,OK,OK,OK temp OK power OK,OK
box 2 ExDS9100cx sn CN8827002Z fw 1.28 fans OK,OK temp OK power OK,OK,FAILED,OK
box 3 ExDS9100cx sn CN8827002Z fw 2.03 fans OK,OK temp OK power OK,OK,OK,OK
Troubleshooting specific issues
Software services
Cannot start services on the management console, a file serving node, or a Linux X9000 client
SELinux might be enabled. To determine the current state of SELinux, use the getenforce command. If it returns enforcing, disable SELinux using either of these commands:
setenforce Permissive
setenforce 0
To permanently disable SELinux, edit its configuration file (/etc/selinux/config) and set the SELINUX parameter to either permissive or disabled.
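A minimal sketch of the permanent change (standard RHEL syntax):
# /etc/selinux/config
SELINUX=disabled
The setting takes effect at the next boot; use setenforce for the running system, as noted above.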
bonding for additional bandwidth. However, mode 6 bonding is more sensitive to issues in the network topology, and has been seen to cause storms of ARP traffic when deployed.
X9000 RPC call to host failed
In /var/log/messages on a file serving node (segment server), you may see messages such as:
ibr_process_status(): Err: RPC call to host=wodao6 failed, error=-651, func=IDE_FSYNC_prepacked
If you see these messages persistently, contact HP Services as soon as possible.
LUN status is failed A LUN status of failed indicates that the logical drive has failed. This is usually the result of failure of three or more disk drives. This can also happen if you remove the wrong disk drive when replacing a failed disk drive. If this situation occurs, take the following steps: 1. Carefully record any recent disk removal or reinsertion actions. Make sure you track the array, box, and bay numbers and know which disk drive was removed or inserted. 2.
11. Perform the following steps for each X9700c controller in turn:
a. Slide out the controller until the LEDs extinguish.
b. Reinsert the controller.
c. Wait for the seven-segment to show "on".
d. Run the exds_stdiag command on the affected server.
e. If OK, the procedure is completed; otherwise, repeat steps a through d on the next controller.
12. If the above steps do not produce results, replace the HP P700m.
13. Boot the server and run exds_stdiag.
14.
the GSI light after each replacement. See Replacing components in the HP ExDS9100 Storage System for replacement instructions. X9700cx drive LEDs are amber after firmware is flashed If the X9700cx drive LEDs are amber after the firmware is flashed, try power cycling the X9700cx again. Configuring the Virtual Connect domain Once configured, the Virtual Connect domain should not need any reconfiguration.
Synchronizing information on file serving nodes and the configuration database To maintain access to a file system, file serving nodes must have current information about the file system. HP recommends that you execute ibrix_health on a regular basis to monitor the health of this information. If the information becomes outdated on a file serving node, execute ibrix_dbck -o to resynchronize the server’s information with the configuration database.
16 Replacing components in the X9720 Network Storage System
Customer replaceable components
WARNING! Before performing any of the procedures in this chapter, read the important warnings, precautions, and safety information in “Warnings and precautions” (page 172) and “Regulatory compliance notices” (page 176).
IMPORTANT: To avoid unintended consequences, HP recommends that you perform the procedures in this chapter during scheduled maintenance times.
Hot-pluggable and non-hot-pluggable components Before removing any serviceable part, determine whether the part is hot-pluggable or non-hot-pluggable. • If the component is hot-pluggable, a power shutdown of the device is not required for replacement of the part. • If the component is not hot-pluggable, the device must be powered down. Returning the defective component In the materials shipped with a CSR component, HP specifies whether the defective component must be returned to HP.
• SAS switch
◦ HP 3Gb SAS BL Switch Installation Instructions
◦ HP 3Gb SAS BL Switch Customer Self Repair Instructions
• X9700cx
◦ HP 600 Modular Disk System Maintenance and Service Guide
Replacing the c7000 blade enclosure and server blade parts
NOTE: This section contains information and procedures specific to the X9720 Network Storage System. For complete instructions on the topics in this section, see the HP BladeSystem c7000 Enclosure Maintenance and Service Guide.
Replacing the blade enclosure
1.
2.
Replacing a server blade disk drive The “system disk” on an X9720 Network Storage System server blade comprises a logical RAID 1 disk which is mirrored over two physical SFF disk drives in the server blade. If one drive fails, the remaining drive continues to service I/O for the server. You do not need to shut down the server blade; disk drives can be hot swapped. However, you must replace the removed drive with a drive of the same size. To replace a disk drive in the server blade: 1.
4. Reconnect the cable that was disconnected in step 1.
5. Remove and then reconnect the uplink to the customer network for bay 2.
NOTE: Clients lose connectivity during this procedure unless you are using a bonded network. After the new VC module is inserted, network connectivity to the X9720 may be lost for approximately 5 seconds while the new module is configured. Alerts may be generated during this period.
bladebay 7 add zonegroup=exds_zonegroup
bladebay 8 add zonegroup=exds_zonegroup
bladebay 9 add zonegroup=exds_zonegroup
bladebay 10 add zonegroup=exds_zonegroup
bladebay 11 add zonegroup=exds_zonegroup
bladebay 12 add zonegroup=exds_zonegroup
bladebay 13 add zonegroup=exds_zonegroup
bladebay 14 add zonegroup=exds_zonegroup
bladebay 15 add zonegroup=exds_zonegroup
bladebay 16 add zonegroup=exds_zonegroup
switch local saveupdate
10.
Replacing capacity block hard disk drive Replace a hard disk drive (HDD) when it is reported in the failed or predict-fail state. The exds_stdiag command reports these states. When a drive is in a predict-fail state, it flashes amber and alerts are generated. HP recommends replacing drives with predictive failures within 24 hours. IMPORTANT: Before replacing a failed hard disk drive, perform the following steps: 1. Check the global service indicator (GSI) light on the front panel of the hard drive drawer.
1. Remove the SAS cable in port 1 that connects the X9700c to the SAS switch in the c-Class blade enclosure. Do not remove the two SAS expansion cables that connect the X9700c controller to the I/O controllers on the X9700cx enclosure.
2. Slide the X9700c controller partially out of the chassis:
a. Squeeze the controller thumb latch and rotate the latch handle down.
b. Pull the controller straight out of the chassis until it has clearly disengaged.
3.
See the HP ExDS9100c/X9720 Storage System Controller Battery Customer Self Repair Instructions for more information. Replacing the X9700c power supply The X9720 Network Storage System can operate using one power supply. You can hot swap a power supply. 1. Remove the old power cord. 2. Remove the power supply module. 3. Insert a new power supply module. See the MSA6X/7X Series Enclosure Power Supply Replacement Instructions for more information.
Replacing the X9700cx I/O module IMPORTANT: You might need to change the firmware of a replaced I/O module; therefore, schedule system downtime of approximately one hour to perform this procedure. There are four I/O modules in an X9700cx chassis—two I/O modules (primary/secondary) for each of the drawers: two on the left, two on the right. Within each drawer you can hot swap one I/O module at a time. Disconnecting both I/O modules interrupts I/O operations.
...
ctlr P89A40C9SW705J ExDS9100cc in 01/SGA830000M slot 1 fw 0126.2008120502 boxes 3 disks 22 luns 5
box 1 ExDS9100c sn SGA830000M fw 1.56 fans OK,OK,OK,OK temp OK power OK,OK
box 2 ExDS9100cx sn CN8827002Z fw 1.28 fans OK,OK temp OK power OK,OK,FAILED,OK
box 3 ExDS9100cx sn CN8827002Z fw 2.03 fans OK,OK temp OK power OK,OK,OK,OK
In this example, the array serial number (box 1) is SGA830000M. The firmware level on box 2 (left drawer of X9700cx) is 1.28. The firmware level on box 3 (right drawer) is 2.03. A difference between the two drawers, as shown here, typically indicates that a replaced I/O module is running different firmware and must be flashed so that both drawers report the same level.
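The output above comes from the exds_stdiag utility run on a file serving node attached to the capacity block. As a convenience, the controller and enclosure summary lines can be filtered out of the full report; the grep pattern below is illustrative only:

# Show only the controller (ctlr) and enclosure (box) summary lines
exds_stdiag | grep -E 'ctlr|box'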
17 Recovering the X9720 Network Storage System
The instructions in this section are necessary in the following situations:
• The X9720 fails and must be recovered.
• A server blade is added or replaced.
• A file serving node fails and must be replaced.
You will need to create a QuickRestore DVD, as described later, and then use it to restore the affected blade. This step installs the operating system and X9000 Software on the blade and launches a configuration wizard.
The server reboots automatically after the installation is complete. Remove the DVD from the USB DVD drive. 7. The Configuration Wizard starts automatically. To configure a file serving node, use one of the following procedures: • When your cluster was configured initially, the installer may have created a template for configuring file serving nodes. To use this template to configure the file serving node undergoing recovery, go to “Configuring a file serving node using the original template” (page 139).
3. The Configuration Wizard attempts to discover management consoles on the network and then displays the results. Select the appropriate management console for this cluster. NOTE: If the list does not include the appropriate management console, or you want to customize the cluster configuration for the file serving node, select Cancel. Go to “Configuring a file serving node manually” (page 144) for information about completing the configuration. 4.
5. The Verify Configuration window shows the configuration received from the management console. Select Accept to apply the configuration to the server and register the server with the management console. NOTE: If you select Reject, the wizard exits and the shell prompt is displayed. You can restart the Wizard by entering the command /usr/local/ibrix/autocfg/bin/menu_ss_wizard or logging in to the server again. 6.
4. The QuickRestore DVD enables the iptables firewall. Either make the firewall configuration match that of your other server blades to allow traffic on appropriate ports, or disable the service entirely by running the chkconfig iptables off and service iptables stop commands. To allow traffic on appropriate ports, open the following ports: 5.
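If you choose to disable the firewall entirely, as described in step 4, run the following standard Red Hat service-management commands as root:

# Prevent the iptables service from starting at boot
chkconfig iptables off
# Stop the currently running iptables service
service iptables stop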
1. If the restored node was previously configured to perform domain authorization for CIFS services, run the following command:
ibrix_auth -n DOMAIN_NAME -A AUTH_PROXY_USER_NAME@domain_name [-P AUTH_PROXY_PASSWORD] -h HOSTNAME
For example:
ibrix_auth -n ibq1.mycompany.com -A Administrator@ibq1.mycompany.com -P password -h ib5-9
If the command fails, check the following:
• Verify that DNS services are running on the node where you ran the ibrix_auth command.
2.
Configuring a file serving node manually Use this procedure to configure file serving nodes manually instead of using the template. (You can launch the wizard manually by entering the command /usr/local/ibrix/autocfg/bin/menu_ss_wizard.) 1. Log into the system as user root (the default password is hpinvent). 2. When the System Deployment Menu appears, select Join an existing cluster. 3. The Configuration Wizard attempts to discover management consoles on the network and then displays the results.
5. The Cluster Configuration Menu lists the configuration parameters that you need to set. Use the Up and Down arrow keys to select an item in the list. When you have made your selection, press Tab to move to the buttons at the bottom of the dialog box, and press Space to go to the next dialog box. 6. Select Management Console from the menu, and enter the IP address of the management console. This is typically the address of the management console on the cluster network.
7. Select Hostname from the menu, and enter the hostname of this server. 8. Select Time Zone from the menu, and then use Up or Down to select your time zone.
9. Select Default Gateway from the menu, and enter the IP address of the host that will be used as the default gateway. 10. Select DNS Settings from the menu, and enter the IP addresses for the primary and secondary DNS servers that will be used to resolve domain names. Also enter the DNS domain name.
11. Select Networks from the menu, and then choose to create a bond for the cluster network. Because you are creating a bonded interface for the cluster network, select Ok on the Select Interface Type dialog box.
Enter a name for the interface (bond0 for the cluster interface) and specify the appropriate options and slave devices. The factory defaults for the slave devices are eth0 and eth3. Use Mode 6 bonding for 1GbE networks and Mode 1 bonding for 10GbE networks. 12. When the Configure Network dialog box reappears, select bond0.
13. To complete the bond0 configuration, press Space to select the Cluster Network role. Then enter the IP address and netmask information that the network will use. Repeat this procedure to create a bonded user network (typically bond1 with eth1 and eth2) and any custom networks as required. 14. When you have completed your entries on the File Serving Node Configuration Menu, select Continue. 15.
16. If the hostname specified for the node already exists in the cluster (the name was used by the node you are replacing), the Replace Existing Server window asks whether you want to replace the existing server with the node you are configuring. When you click Yes, the replacement node will be registered. IMPORTANT: Next, go to “Completing the restore on a file serving node” (page 141).
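After the replacement node registers, you can optionally confirm that it joined the cluster by listing the servers known to the management console. This is a sketch; ibrix_server is part of the X9000 Software CLI, and the output columns vary by release:

# List the file serving nodes registered with the management console
ibrix_server -l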
18 Support and other resources Contacting HP For worldwide technical support information, see the HP support website: http://www.hp.com/support Before contacting HP, collect the following information: • Product model names and numbers • Technical support registration number (if applicable) • Product serial numbers • Error messages • Operating system type and revision level • Detailed questions Related information Related documents are available on the Manuals page at http://www.hp.
Installing and maintaining the HP 3Gb SAS BL Switch • HP 3Gb SAS BL Switch Installation Instructions • HP 3Gb SAS BL Switch Customer Self Repair Instructions On the Manuals page, click bladesystem > BladeSystem Interconnects > HP BladeSystem SAS Interconnects. Maintaining the X9700cx (also known as the HP 600 Modular Disk System) • HP 600 Modular Disk System Maintenance and Service Guide Describes removal and replacement procedures.
Subscription service HP recommends that you register your product at the Subscriber's Choice for Business website: http://www.hp.com/go/e-updates After registering, you will receive email notification of product enhancements, new driver versions, firmware updates, and other product resources.
A Component and cabling diagrams
Base and expansion cabinets
An X9720 Network Storage System base cabinet has from 3 to 16 performance blocks (that is, server blades) and from 1 to 4 capacity blocks. An expansion cabinet can support up to four more capacity blocks, bringing the system to eight capacity blocks. The first server blade is configured as the management console. The other servers are configured as file serving nodes.
Back view of a base cabinet with one capacity block
1. Management switch 2
2. Management switch 1
3. X9700c 1
4. TFT monitor and keyboard
5. c-Class Blade enclosure
6.
Front view of a full base cabinet
1 X9700c 4
2 X9700c 3
3 X9700c 2
4 X9700c 1
5 X9700cx 4
6 X9700cx 3
7 TFT monitor and keyboard
8 c-Class Blade Enclosure
9 X9700cx 2
10 X9700cx 1
Back view of a full base cabinet
1 Management switch 2
2 Management switch 1
3 X9700c 4
4 X9700c 3
5 X9700c 2
6 X9700c 1
7 X9700cx 4
8 X9700cx 3
9 TFT monitor and keyboard
10 c-Class Blade Enclosure
11 X9700cx 2
12 X9700cx 1
Front view of an expansion cabinet
The optional X9700 expansion cabinet can contain from one to four capacity blocks. The following diagram shows a front view of an expansion cabinet with four capacity blocks.
1. X9700c 8
2. X9700c 7
3. X9700c 6
4. X9700c 5
5. X9700cx 8
6. X9700cx 7
7. X9700cx 6
8. X9700cx 5
Back view of an expansion cabinet with four capacity blocks
1. X9700c 8
2. X9700c 7
3. X9700c 6
4. X9700c 5
5. X9700cx 8
6. X9700cx 7
7. X9700cx 6
8. X9700cx 5
Performance blocks (c-Class Blade enclosure)
A performance block is a special server blade for the X9720. Server blades are numbered according to their bay number in the blade enclosure. Server 1 is in bay 1 in the blade enclosure, and so on. Server blades must be contiguous; empty blade bays are not allowed between server blades.
Rear view of a c-Class Blade enclosure
1. Interconnect bay 1 (Virtual Connect Flex-10 10 Ethernet Module)
2. Interconnect bay 2 (Virtual Connect Flex-10 10 Ethernet Module)
3. Interconnect bay 3 (SAS Switch)
4. Interconnect bay 4 (SAS Switch)
5. Interconnect bay 5 (reserved for future use)
6. Interconnect bay 6 (reserved for future use)
7. Interconnect bay 7 (reserved for future use)
8. Interconnect bay 8 (reserved for future use)
9. Onboard Administrator 1
10. Onboard Administrator 2
The X9720 Network Storage System automatically reserves eth0 and eth3 and creates a bonded device, bond0. This is the management network. Although eth0 and eth3 are physically connected to the Flex-10 Virtual Connect (VC) modules, the VC domain is configured so that this network is not seen by the site network. With this configuration, eth1 and eth2 are available for connecting each server blade to the site network.
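As an illustration only, the bonding configuration that results on a node resembles a standard Red Hat ifcfg file along the following lines. The configuration wizard generates the actual files, and the IP address shown is a placeholder; Mode 6 versus Mode 1 selection is described in "Configuring a file serving node manually":

# /etc/sysconfig/network-scripts/ifcfg-bond0 (illustrative sketch only)
DEVICE=bond0
BOOTPROTO=none
ONBOOT=yes
IPADDR=172.16.1.5                  # placeholder management-network address
NETMASK=255.255.255.0
BONDING_OPTS="mode=6 miimon=100"   # mode=6 for 1GbE; use mode=1 for 10GbE

# /etc/sysconfig/network-scripts/ifcfg-eth0 (slave interface; eth3 is analogous)
DEVICE=eth0
MASTER=bond0
SLAVE=yes
BOOTPROTO=none
ONBOOT=yes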
X9700c (array controller with 12 disk drives)
Front view of an X9700c
1. Bay 1
2. Bay 2
3. Bay 3
4. Bay 4
5. Power LED
6. System fault LED
7. UID LED
8. Bay 12
Rear view of an X9700c
1. Battery 1
2. Battery 2
3. SAS expander port 1
4. UID
5. Power LED
6. System fault LED
7. On/Off power button
8. Power supply 2
9. Fan 2
10. X9700c controller 2
11. SAS expander port 2
12. SAS port 1
13. X9700c controller 1
14. Fan 1
15. Power supply 1
Front view of an X9700cx
1. Drawer 1
2. Drawer 2
Rear view of an X9700cx
1. Power supply
2. Primary I/O module drawer 2
3. Primary I/O module drawer 1
4. Out SAS port
5. In SAS port
6. Secondary I/O module drawer 1
7. Secondary I/O module drawer 2
8. Fan
Cabling diagrams
Capacity block cabling—Base and expansion cabinets
A capacity block comprises the X9700c and X9700cx. CAUTION: Correct cabling of the capacity block is critical for proper X9720 Network Storage System operation.
1 X9700c
2 X9700cx primary I/O module (drawer 2)
3 X9700cx secondary I/O module (drawer 2)
4 X9700cx primary I/O module (drawer 1)
5 X9700cx secondary I/O module (drawer 1)
Virtual Connect Flex-10 Ethernet module cabling—Base cabinet
Diagram labels: Site network, Onboard Administrator, Available uplink port
1. Management switch 2
2. Management switch 1
3.
7. Bay 5 (reserved for future use)
8. Bay 6 (reserved for future use)
SAS switch cabling—Base cabinet
NOTE: Callouts 1 through 3 indicate additional X9700c components.
1 X9700c 4
2 X9700c 3
3 X9700c 2
4 X9700c 1
5 SAS switch ports 1 through 4 (in interconnect bay 3 of the c-Class Blade Enclosure). Ports 2 through 4 are reserved for additional capacity blocks.
6 SAS switch ports 5 through 8 (in interconnect bay 3 of the c-Class Blade Enclosure). Reserved for expansion cabinet use.
1 X9700c 8
2 X9700c 7
3 X9700c 6
4 X9700c 5
5 SAS switch ports 1 through 4 (in interconnect bay 3 of the c-Class Blade Enclosure). Used by base cabinet.
6 SAS switch ports 5 through 8 (in interconnect bay 3 of the c-Class Blade Enclosure).
7 SAS switch ports 1 through 4 (in interconnect bay 4 of the c-Class Blade Enclosure).
8 SAS switch ports 5 through 8 (in interconnect bay 4 of the c-Class Blade Enclosure). Used by base cabinet.
B Spare parts list
The following tables list spare parts (both customer replaceable and non-customer replaceable) for the X9720 Network Storage System components. The spare parts information is current as of the publication date of this document. For the latest spare parts information, go to http://partsurfer.hp.com.
Spare parts are categorized as follows:
• Mandatory. Parts for which customer self repair is mandatory.
Description Spare part number Customer self repair
SPS-STICK,4X FIXED,C-13,OFFSET,WW 483915-001 Optional
SPS-BRACKETS,PDU 252641-001 Optional
SPS-SPS-STICK,ATTACH'D CBL,C13 0-1FT 419595-001 Mandatory
Description Spare part number Customer self repair
SPS-FAN, SYSTEM 413996-001 Mandatory
SPS-BD, MID PLANE ASSY 519345-001 No
SPS-SLEEVE, ONBRD ADM 519346-001 Mandatory
SPS-MODULE, OA, DDR2 503826-001 Mandatory
SPS-LCD MODULE, WIDESCREEN ASSY 519349-001 No
SPS-P/S,2450W,12V,HTPLG 500242-001 Mandatory
Description Spare part number Customer self repair
SPS-BD,BATTERY CHARGER,MOD,4/V700HT 462976-001 Mandatory
SPS-BD,MEM,MOD,256MB,40B 462974-001 Mandatory
SPS-MISC CABLE KIT 511789-001 Mandatory
AW551A—X9700 82TB Capacity Block (X9700c and X9700cx)
Note the following:
• The X9700c midplane is used for communication between controllers.
• There are 2x backplanes in the X9700c.
Description Spare part number Customer self repair
SPS-CA,EXT MINI SAS, 2M 408767-001 Mandatory
SPS-CA,EXT MINI SAS, 4M 408768-001 Mandatory
AW598B—X9700 164TB Capacity Block (X9700c and X9700cx)
Note the following:
• The X9700c midplane is used for communication between controllers.
• There are 2x backplanes in the X9700c.
C Warnings and precautions Electrostatic discharge information To prevent damage to the system, be aware of the precautions you need to follow when setting up the system or handling parts. A discharge of static electricity from a finger or other conductor could damage system boards or other static-sensitive devices. This type of damage could reduce the life expectancy of the device.
Equipment symbols If the following symbols are located on equipment, hazardous conditions could exist. WARNING! Any enclosed surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards. Enclosed area contains no operator serviceable parts. To reduce the risk of injury from electrical shock hazards, do not open this enclosure. WARNING! Any RJ-45 receptacle marked with these symbols indicates a network interface connection.
WARNING! To reduce the risk of personal injury or damage to the equipment: • Observe local occupational safety requirements and guidelines for heavy equipment handling. • Obtain adequate assistance to lift and stabilize the product during installation or removal. • Extend the leveling jacks to the floor. • Rest the full weight of the rack on the leveling jacks. • Attach stabilizing feet to the rack if it is a single-rack installation.
WARNING! To reduce the risk of personal injury or damage to the equipment, the installation of non-hot-pluggable components should be performed only by individuals who are qualified in servicing computer equipment, knowledgeable about the procedures and precautions, and trained to deal with products capable of producing hazardous energy levels.
D Regulatory compliance notices Regulatory compliance identification numbers For the purpose of regulatory compliance certifications and identification, this product has been assigned a unique regulatory model number. The regulatory model number can be found on the product nameplate label, along with all required approval markings and information. When requesting compliance information for this product, always refer to this regulatory model number.
off and on, the user is encouraged to try to correct the interference by one or more of the following measures: • Reorient or relocate the receiving antenna. • Increase the separation between the equipment and receiver. • Connect the equipment into an outlet on a circuit that is different from that to which the receiver is connected. • Consult the dealer or an experienced radio or television technician for help.
This compliance is indicated by the following conformity marking placed on the product: This marking is valid for non-Telecom products and EU harmonized Telecom products (e.g., Bluetooth). Certificates can be obtained from http://www.hp.com/go/certificates.
Class B equipment
Taiwanese notices
BSMI Class A notice
Taiwan battery recycle statement
Turkish recycling notice
Türkiye Cumhuriyeti: EEE Yönetmeliğine Uygundur
Vietnamese Information Technology and Communications compliance marking
Laser compliance notices
English laser notice
This device may contain a laser that is classified as a Class 1 Laser Product in accordance with U.S. FDA regulations and the IEC 60825-1. The product does not emit hazardous laser radiation.
WARNING! Use of controls or adjustments or performance of procedures other than those specified herein or in the laser product's installation guide may result in hazardous radiation exposure.
German laser notice
Italian laser notice
Japanese laser notice
Spanish laser notice
Recycling notices
English recycling notice
Disposal of waste equipment by users in private households in the European Union
This symbol means do not dispose of your product with your other household waste. Instead, you should protect human health and the environment by handing over your waste equipment to a designated collection point for the recycling of waste electrical and electronic equipment.
Dutch recycling notice Inzameling van afgedankte apparatuur van particuliere huishoudens in de Europese Unie Dit symbool betekent dat het product niet mag worden gedeponeerd bij het overige huishoudelijke afval. Bescherm de gezondheid en het milieu door afgedankte apparatuur in te leveren bij een hiervoor bestemd inzamelpunt voor recycling van afgedankte elektrische en elektronische apparatuur. Neem voor meer informatie contact op met uw gemeentereinigingsdienst.
Hungarian recycling notice A hulladék anyagok megsemmisítése az Európai Unió háztartásaiban Ez a szimbólum azt jelzi, hogy a készüléket nem szabad a háztartási hulladékkal együtt kidobni. Ehelyett a leselejtezett berendezéseknek az elektromos vagy elektronikus hulladék átvételére kijelölt helyen történő beszolgáltatásával megóvja az emberi egészséget és a környezetet.További információt a helyi köztisztasági vállalattól kaphat.
Portuguese recycling notice Descarte de equipamentos usados por utilizadores domésticos na União Europeia Este símbolo indica que não deve descartar o seu produto juntamente com os outros lixos domiciliares. Ao invés disso, deve proteger a saúde humana e o meio ambiente levando o seu equipamento para descarte em um ponto de recolha destinado à reciclagem de resíduos de equipamentos eléctricos e electrónicos. Para obter mais informações, contacte o seu serviço de tratamento de resíduos domésticos.
Recycling notices
English recycling notice
Bulgarian recycling notice
Czech recycling notice
Danish recycling notice
Dutch recycling notice
Estonian recycling notice
Finnish recycling notice
French recycling notice
German recycling notice
Greek recycling notice
Hungarian recycling notice
Italian recycling notice
Latvian recycling notice
Lithuanian recycling notice
Polish recycling notice
Portuguese recycling notice
Romanian recycling notice
Slovak recycling notice
Spanish recycling notice
Swedish recycling notice
Battery replacement notices
Dutch battery notice
French battery notice
German battery notice
Italian battery notice
Japanese battery notice
Spanish battery notice
Glossary
ACE Access control entry.
ACL Access control list.
ADS Active Directory Service.
ALB Advanced load balancing.
BMC Baseboard Management Controller.
CIFS Common Internet File System. The protocol used in Windows environments for shared folders.
CLI Command-line interface. An interface consisting of various commands used to control operating system responses.
CSR Customer self repair.
DAS Direct attach storage.
SELinux Security-Enhanced Linux.
SFU Microsoft Services for UNIX.
SID Secondary controller identifier number.
SNMP Simple Network Management Protocol.
TCP/IP Transmission Control Protocol/Internet Protocol.
UDP User Datagram Protocol.
UID Unit identification.
USM SNMP User Security Model.
VACM SNMP View Access Control Model.
VC HP Virtual Connect.
VIF Virtual interface.
WINS Windows Internet Naming Service.
WWN World Wide Name. A unique identifier assigned to a Fibre Channel device.
Index
A
ACU
  using hpacucli, 112
adding server blades, 97
agile management console, 28
AutoPass, 96
B
backups
  file systems, 44
  management console configuration, 44
  NDMP applications, 44
battery replacement notices, 192
blade enclosure
  replacing, 129
booting server blades, 14
booting X9720, 14
C
cabling diagrams, 164
Canadian notice, 177
capacity block
  overview, 162
capacity blocks
  removing, 105
  replacing disk drive, 133
clients
  access virtual interfaces, 26
c
  run health check, 126
  start or stop processes, 69
  troubleshooting, 121
  tune, 69
  view process status, 69
file systems
  segments migrate, 70
Flex-10 networks, 161
G
grounding methods, 172
H
hazardous conditions
  symbols on equipment, 173
HBAs
  delete HBAs, 35
  delete standby port pairings, 35
  discover, 35
  identify standby-paired ports, 35
  list information, 35
  monitor for high availability, 34
  monitoring, turn on or off, 35
health check reports, 53
help
  obtaining, 152
High Availability
  agile management console,
network interfaces
  add routing table entries, 76
  bonded and virtual interfaces, 73
  defined, 73
  delete, 77
  delete monitors, 34
  delete routing table entries, 76
  delete standbys, 34
  guidelines, 25
  identify monitors, 34
  identify standbys, 33
  set up monitoring, 32
  viewing, 77
network testing, 113
NIC failover, 26
O
OA
  accessing via serial port, 111
  accessing via service port, 112
  replacing, 130
Onboard Administrator
  accessing via serial port, 111
  accessing via service port, 112
  replacing, 130
P
P700m mezzanine
system startup after power failure, 68
T
Taiwanese notices, 179
technical support
  HP, 152
  service locator website, 153
troubleshooting, 107
  escalating issues, 110
U
upgrades
  Linux X9000 clients, 91
  X9000 Software, 82
    automatic, 85
    manual, 88
user network interface
  add, 73
  configuration rules, 76
  defined, 73
  identify for X9000 clients, 74
  modify, 74
  prefer, 74
  unprefer, 75
V
VC module
  replacing, 130
Virtual Connect domain, configure, 125
Virtual Connect module
  replacing, 130
virtual interfaces, 25
  bonded,