HP StorageWorks Scalable File Share Client Installation and User Guide
Version 2.2
Product Version: HP StorageWorks Scalable File Share Version 2.2
© Copyright 2005, 2006 Hewlett-Packard Development Company, L.P. Lustre® is a registered trademark of Cluster File Systems, Inc. Linux is a U.S. registered trademark of Linus Torvalds. Quadrics® is a registered trademark of Quadrics, Ltd. Myrinet® and Myricom® are registered trademarks of Myricom, Inc. InfiniBand® is a registered trademark and service mark of the InfiniBand Trade Association. Microsoft and Windows are U.S. registered trademarks of Microsoft Corporation.
Contents
About this guide
1 Overview
1.1 Overview of the Lustre file system
1.2 Overview of HP SFS
1.3 HP SFS client configurations
3.3.2 Step 2: Installing the client software
3.3.3 Step 3: Running the sfsconfig command after installing the software
3.3.4 Step 4: Completing other configuration tasks
3.3.4.1 Configuring interconnect interfaces
6.2 Dealing with ENOSPC or EIO errors
6.2.1 Determining the file system capacity using the lfs df command
6.2.2 Dealing with insufficient inodes on a file system
6.2.3 Freeing up space on OST services
About this guide
This guide describes how to install and configure the HP StorageWorks Scalable File Share (HP SFS) client software on client nodes that will use Lustre® file systems on HP SFS systems. It also includes instructions for mounting and unmounting file systems on client nodes.
HP SFS documentation
The HP StorageWorks Scalable File Share documentation set consists of the following documents:
• HP StorageWorks Scalable File Share Release Notes
• HP StorageWorks Scalable File Share for EVA4000 Hardware Installation Guide
• HP StorageWorks Scalable File Share for SFS20 Enclosure Hardware Installation Guide
• HP StorageWorks Scalable File Share System Installation and Upgrade Guide
• HP StorageWorks Scalable File Share System User Guide
• HP StorageWorks Scalable File Share Client Installation and User Guide (this guide)
Naming conventions
This section lists the naming conventions used for an HP SFS system in this guide. You are free to choose your own name for your HP SFS system.

System Component                                   Value
Name of the HP SFS system (the system alias)       south
Name of the HP SFS administration server           south1
Name of the HP SFS MDS server                      south2

For more information
For more information about HP products, access the HP Web site at the following URL: www.hp.com
1 Overview
HP StorageWorks Scalable File Share Version 2.2 (based on Lustre® technology) is a product from HP that uses the Lustre File System (from Cluster File Systems, Inc.). An HP StorageWorks Scalable File Share (HP SFS) system is a set of independent servers and storage subsystems combined through system software and networking technologies into a unified system that provides a storage system for standalone servers and/or compute clusters.
1.1 Overview of the Lustre file system Lustre is a design for a networked file system that is coherent, scalable, parallel, and targeted towards high performance computing (HPC) environments. Lustre separates access to file data from access to file metadata. File data is accessed through an object interface, which provides a higher level of access than a basic block store. Each logical file store is called an Object Storage Target (OST) service.
A typical Lustre file system consists of multiple Object Storage Servers that have storage attached to them. At present, the Object Storage Servers are Linux servers, but it is anticipated that in the future the Object Storage Servers may be storage appliances that run Lustre protocols. The Object Storage Servers are internetworked over potentially multiple networks to Lustre client nodes, which must run a version of the Linux operating system.
HP SFS Version 2.2-0 software has been tested successfully with the following interconnect types:
• Gigabit Ethernet interconnect
• Quadrics interconnect (QsNetII) (from Quadrics, Ltd.)
• Myrinet interconnect (Myrinet XP and Myrinet 2XP) (from Myricom, Inc.)
• Voltaire InfiniBand interconnect (HCA 400) (from Voltaire, Inc.)
For details of the required firmware versions for Voltaire InfiniBand interconnect adapters, refer to Appendix A in the HP StorageWorks Scalable File Share Release Notes.
1.3 HP SFS client configurations
1.3.2.1 Supported upgrade paths for HP SFS and HP XC configurations
The supported upgrade paths for HP SFS and HP XC configurations are shown in Table 1-1.

Table 1-1 Supported upgrade paths for HP SFS and HP XC configurations

Existing versions:
  XC Version 2.1 PK02    SFS Client Version 2.1-1    SFS Server Version 2.1-1
Can be upgraded to (recommended configurations):
  Upgrade path: Upgrade all    XC Version 3.1         SFS Client Version 2.2-0    SFS Server Version 2.2-0
  Upgrade path: Upgrade all    XC Version 3.0 PK02    SFS Client Version 2.2-0    SFS Server Version 2.2-0
1.3.3 HP SFS with RHEL and SLES 9 SP3 client configurations
In addition to HP XC systems (as described in Section 1.3.2), the HP SFS Version 2.2-0 client software has been tested and shown to work successfully with a number of other client configurations. The tested configurations are listed in Section 1.3.3.1. HP has also identified a number of client configurations that are likely to work successfully with HP SFS Version 2.2-0 but have not been fully tested. These configurations are listed in Section 1.3.3.2.
Table 1-2 Tested client configurations

Architecture: i686, ia64, ia32e, x86_64
Distribution: SLES 9 SP3 (2)
Kernel Version: 2.6.5-7.244
Interconnect: Gigabit Ethernet interconnect

1. In subsequent releases of the HP SFS product, HP will not test or support Red Hat Enterprise Linux 2.1 AS client systems as Lustre clients.
2. The versions of the Lustre client software (not the kernel) shipped with SLES 9 SP3 are obsolete, and are not compatible with HP SFS Version 2.2-0.
Table 1-3 Untested client configurations

Architecture: ia32e
Distribution: RHEL 4 Update 2
Kernel Version: 2.6.9-22.0.2.EL
Interconnect:
• Gigabit Ethernet interconnect
• Quadrics interconnect (QsNetII) (from Quadrics, Ltd.) Version 5.23.2
• Myrinet interconnect (Myrinet XP and Myrinet 2XP) (from Myricom, Inc.) Version 2.1.26
• Voltaire InfiniBand Interconnect Version 3.5.5

Architecture: i686
Distribution: RHEL 4 Update 1
Kernel Version: 2.6.9-11.EL
Interconnect: Gigabit Ethernet interconnect

Architecture: ia64
Distribution: RHEL 4 Update 1
Kernel Version: 2.6.9-11.EL
Table 1-3 Untested client configurations

Architecture: ia32e
Distribution: RHEL 3 Update 6
Kernel Version: 2.4.21-37.EL
Interconnect:
• Gigabit Ethernet interconnect
• Quadrics interconnect (QsNetII) (from Quadrics, Ltd.) Version 5.23.2
• Myrinet interconnect (Myrinet XP and Myrinet 2XP) (from Myricom, Inc.) Version 2.1.26
• Voltaire InfiniBand Interconnect Version 3.4.5

Architecture: i686
Distribution: RHEL 3 Update 5
Kernel Version: 2.4.21-32.0.1.EL
Interconnect: Gigabit Ethernet interconnect

Distribution: RHEL 3 Update 5
Kernel Version: 2.4.21-32.0.1.EL
2 Installing and configuring HP XC systems
To allow client nodes to mount the Lustre file systems on an HP SFS system, the HP SFS client software and certain other software components must be installed and configured on the client nodes. This chapter describes how to perform these tasks on HP XC systems. This chapter is organized as follows:
• HP SFS client software for HP XC systems (Section 2.1)
• Installing the HP SFS client software on HP XC systems (new installations) (Section 2.2)
2.1 HP SFS client software for HP XC systems The prebuilt packages that you will need for installing the HP SFS client software on your HP XC systems are provided on the HP StorageWorks Scalable File Share Client Software CD-ROM. The packages are located in the arch/distro directory. • The possible architectures are ia64, x86_64, and ia32e (em64t). • There is one directory for each supported version of the HP XC distribution.
3. The binary distribution directory contains a number of subdirectories, with one subdirectory for each architecture. Within each subdirectory, there is an XC directory containing binary RPM files. Identify the correct directory for the architecture on your client node, then change to that directory, as shown in the following example. In this example, the architecture is ia64 and the HP XC software version is 3.0:
# cd ia64/XC_3.0
When you have finished configuring the options lnet and lquota settings, proceed to Section 2.2.3 to complete the remaining additional configuration tasks. 2.2.3 Step 3: Completing other configuration tasks on the head node To complete the configuration of the head node, perform the following tasks: 1. Configure interconnect interfaces (see Section 2.2.3.1). 2. Configure the NTP server (see Section 2.2.3.2). 3. Configure firewalls (see Section 2.2.3.3). 4.
2.2.3.2 Configuring the NTP server For the HP SFS diagnostics to work correctly, the date and time on the client nodes must be synchronized with the date and time on other client nodes, and with the date and time on the servers in the HP SFS system. In addition, synchronizing the date and time on the systems keeps the logs on the systems synchronized, and is helpful when diagnosing problems.
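For example, you can confirm that a node is synchronized with its configured time source; the time server name shown here is illustrative only and must match your site's NTP configuration:
# grep ^server /etc/ntp.conf
server ntp1.mysite.example.com
# ntpq -p
A node that is synchronized shows one of the listed servers marked with an asterisk (*) in the ntpq -p output.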
In both cases, the name of the HP SFS system must resolve on the HP XC system. HP recommends you do this in the /etc/hosts file. Verify that the alias works—for example, use the ssh(1) command to log on to the HP SFS system. When you have finished adding and verifying the HP SFS server alias, proceed to Section 2.2.6 to verify that each file system can be mounted. 2.2.
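For example, with the system alias used in this guide, a first mount check on the head node might look like the following; the file system name and mount point are illustrative, and the http: form shown here requires TCP/IP access to the HP SFS servers (the lnet: form described in Chapter 4 can be used instead):
# mkdir -p /mnt/data
# sfsmount http://south/data /mnt/data
# df /mnt/data
If the command returns without error and df reports the Lustre file system, the mount is working.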
• The mount operation may stall for up to ten minutes. Do not interrupt the mount operation—as soon as the file system moves to the started state, the mount operation will complete. If the mount operation has not completed after ten minutes, you must investigate the cause of the failure further. See Section 7.2.1 of this guide, and Chapter 9 of the HP StorageWorks Scalable File Share System User Guide for information on troubleshooting mount operation failures.
2.2.7 Step 7: Creating the /etc/sfstab.proto file
Example 2-1 Sample /etc/sfstab.proto file

#% n1044
lnet://10.0.128.2@vib0,10.0.128.1@vib0:/south-mds10/client_vib /hptc_cluster server=south,fs=hptc_cluster 0 0
lnet://10.0.128.2@vib0,10.0.128.1@vib0:/south-mds9/client_vib /data sfs bg,server=south,fs=data 0 0
#% n[1-256]
lnet://10.0.128.2@vib0,10.0.128.1@vib0:/south-mds10/client_vib /hptc_cluster max_cached_mb=2,max_rpcs_in_flight=2,server=south,fs=hptc_cluster 0 0
lnet://10.0.128.2@vib0,10.0.128.
2.3 Upgrading HP SFS client software on existing HP XC systems The HP XC version on the client nodes must be capable of interoperating with the HP SFS server and client versions. In addition, the HP SFS client version must be capable of interoperating with the HP SFS server version on the servers in the HP SFS system. See Section 1.3.2 for details of which HP XC and HP SFS versions can interoperate successfully. To upgrade existing HP XC systems, perform the following tasks: 1.
4. Remove all of the existing HP SFS RPM files on the head node in the order in which they were installed, as shown in the following example:
NOTE: In the example shown here, the python-ldap package is removed. This package needs to be removed only on HP Integrity systems; omit it from the command on all other systems.
# rpm -ev lustre-modules-version_number \
  lustre-lite-version_number \
  python-ldap-version_number \
  hpls-lustre-client-version_number \
  hpls-diags-client-version_number
5. Reboot the head node.
NOTE: The sfsconfig command uses the http: protocol to get configuration information from the HP SFS servers. If the head node does not have access to the HP SFS servers over a TCP/IP network, or if the servers are offline, the sfsconfig command will not be able to configure the head node correctly, and you will have to modify the configuration file manually. For instructions on how to do this, see Appendix B. When you have finished running the sfsconfig command, proceed to Section 2.3.
specifically the section titled Disabling Portals compatibility mode (when client nodes have been upgraded). 3. On the head node in the HP XC system, edit the /etc/modprobe.conf.lustre file and change the portals_compatibility setting to none. 4. Use the cluster_config utility to update the golden image with the modified /etc/modprobe.conf.lustre file. 5. Propagate the modified /etc/modprobe.conf.lustre file to all nodes. 6. Remount the Lustre file systems on the nodes.
2. Unmount all Lustre file systems on the head node, as follows:
# sfsumount -a
3. Remove all of the existing HP SFS RPM files on the head node in the order in which they were installed, as shown in the following example:
NOTE: In the example shown here, the python-ldap package is removed. This package needs to be removed only on HP Integrity systems; omit it from the command on all other systems.
3 Installing and configuring Red Hat Enterprise Linux and SUSE Linux Enterprise Server 9 SP3 client systems To allow client nodes to mount the Lustre file systems on an HP SFS system, the HP SFS client software and certain other software components must be installed and configured on the client nodes. This chapter describes how to perform these tasks on Red Hat Enterprise Linux (RHEL) and SUSE Linux Enterprise Server 9 SP3 (SLES 9 SP3) systems.
3.1 HP SFS client software for RHEL and SLES 9 SP3 systems
The SFS Client Enabler is on the HP StorageWorks Scalable File Share Client Software CD-ROM in the client_enabler/ directory. The layout of the directory is as follows:
client_enabler/VERSION
/build_SFS_client.sh
/src/common/autotools/autoconf-version.tar.gz
/automake-version.tar.gz
/cfgs/build configuration files
/diags_client/diags_client.tgz
/gm/gm sources
/kernels/vendor/dist/kernel sources
/lustre/lustre-version.
• ia32e/RHEL3.0_U8
• ia64/RHEL3.0_U8
• x86_64/RHEL3.0_U8
• Locating the python-ldap and hpls-diags-client packages (Section 3.2.4) • List of patches in the client-rh-2.4.21-32 series file (Section 3.2.5) • Additional patches (Section 3.2.6) 3.2.1 Prerequisites for the SFS Client Enabler To build a customized HP SFS client kit using the SFS Client Enabler, you must have the following resources: • An appropriate system on which to perform the build.
• Additional Lustre patches (some distributions only). Where appropriate, you will find additional Lustre patches in the client_enabler/src/arch/ distro/lustre_patches directory on the HP StorageWorks Scalable File Share Client Software CD-ROM. In this directory, you will find a file that lists the patches to be applied and the order in which they must be applied. Not all distributions require patches to the Lustre sources, so this directory may or may not exist for your particular distribution.
• If you are building on a SLES 9 SP3 system, you must make sure that the /usr/src/packages/[BUILD|SOURCES|SPECS] directories are all empty. You must also have an appropriate kernel-source package installed and, if your kernel is already built in the /usr/src/linux directory, add the --prebuilt_kernel option to the command line when you run the build_SFS_client.sh script.
• Myrinet interconnect:
  • To add support for the Myrinet interconnect driver, add the following to the command line:
    --config gm
  • To change the gm source RPM file used, add the following to the command line:
    --gm path_to_gm_driver_source_RPM
  • To drop the Myrinet option, specify "" as the path.
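Putting these options together, a build for a client that uses a Myrinet interconnect might be started as shown below; the path to the gm source RPM file is a placeholder, and the --prebuilt_kernel option is only needed in the SLES 9 SP3 case described earlier. Run the script from the directory to which you copied the client_enabler/ contents:
$ ./build_SFS_client.sh --config gm --gm /var/tmp/gm-2.1.26_Linux.src.rpm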
3. Edit the bootloader configuration file so that the new kernel is selected as the default for booting. If your boot loader is GRUB, you can alternatively use the /sbin/grubby --set-default command, as shown in the following example:
# grubby --set-default /boot/vmlinuz-2.4.21-37.EL_SFS2.2_0
4. Reboot the system to boot the new kernel, as follows:
# reboot
5. Copy the built Linux tree to the /usr/src/linux directory, as follows:
# mkdir -p /usr/src/linux
# cd /usr/src/linux
# (cd /build/SFS_client_V2.
When a Voltaire InfiniBand Version 3.4.5 interconnect driver is used, the ARP resolution parameter on each of the client nodes must be changed after the HP SFS client software has been installed on the client node. This task is included in the installation instructions provided later in this chapter (see Step 11 in Section 3.3.2). 3.2.3 Output from the SFS Client Enabler The build_SFS_client.sh script creates output .rpm files in architecture-specific directories.
• hpls-diags-client
Use the version of the hpls-diags-client package that you built when you created the HP SFS client kit; however, if the package failed to build, you can find the hpls-diags-client package (for some architectures and distributions) on the HP StorageWorks Scalable File Share Client Software CD-ROM, in the appropriate directory for your particular client architecture/distribution combination.
3.2.5 List of patches in the client-rh-2.4.21-32 series file
• listman-2.4.21-chaos.patch
  Adds 2.6 kernel-compatible list utilities for use in Lustre.
• bug2707_fixed-2.4.21-rh.patch
  Provides a bugfix for a race between create and chmod which can cause files to be inaccessible.
• inode-max-readahead-2.4.24.patch
  Allows individual file systems to have varying readahead limits. Used by Lustre to set its own readahead limits.
• export-show_task-2.4-rhel.patch
  Exports the show_task kernel symbol.
• compile-fixes-2.4.21-rhel_hawk.patch
3.3 Installing the HP SFS client software on RHEL and SLES 9 SP3 systems (new installations) NOTE: HP does not provide prebuilt binary packages for installing the HP SFS client software for RHEL and SLES 9 SP3 systems. You must build your own HP SFS client kit as described in Section 3.2 and then install some prerequisite packages and the HP SFS client software. The HP SFS client version must be capable of interoperating with the HP SFS server version on the servers in the HP SFS system. See Section 1.3.
3.3.2 Step 2: Installing the client software To install the HP SFS client software on a client node, perform the following steps: 1. Mount the HP StorageWorks Scalable File Share Client Software CD-ROM on the target client node, as follows: # mount /dev/cdrom /mnt/cdrom 2. Change to the top level directory, as follows: # cd /mnt/cdrom 3. The distribution directory contains a number of subdirectories, with one subdirectory for each architecture.
Note the following points: • In kits where the gm package is provided, it must be installed even if no Myrinet interconnect is used. The package is needed to resolve symbols in the Lustre software. • The hpls-lustre-client package requires the openldap-clients package. The openldap-clients package is usually part of your Linux distribution. • The lustre package requires the python2 and python-ldap packages. The python2 package is usually part of your Linux distribution.
hpls-lustre-client-version_number.rpm \ hpls-diags-client-version_number.rpm NOTE: The kernel package makes a callout to the new-kernel-pkg utility to update the boot loader with the new kernel image. Ensure that the correct boot loader (GRUB, Lilo, and so on) has been updated. The installation of the package does not necessarily make the new kernel the default for booting—you may need to edit the appropriate bootloader configuration file so that the new kernel is selected as the default for booting.
TIP: Alternatively, you can use the ib-setup tool to configure this setting on each client node. 11. This step applies only if a Voltaire InfiniBand Version 3.4.5 interconnect driver is used. You must change the ARP resolution parameter on each of the client nodes. By default, this parameter is set to Dynamic Path Query; you must now update it to Static Path Query (unless there is a specific reason why it needs to be set to Dynamic Path Query).
When the script has completed, examine the /etc/modprobe.conf.lustre or / etc/modules.conf.lustre file and the /etc/modprobe.conf or /etc/modules.conf file to ensure that the options lnet settings and the lquota settings have been added (see Appendix B for more information on the settings). Note that the sfsconfig command uses the http: protocol to get configuration information from the HP SFS servers.
It is possible to restrict the interfaces that a client node uses to communicate with the HP SFS system by editing the options lnet settings in the /etc/modprobe.conf or /etc/modules.conf file; see Appendix B. 3.3.4.1.2 Configuring Voltaire InfiniBand interfaces If the HP SFS system uses a partitioned InfiniBand interconnect, you may need to configure additional InfiniBand IP (IPoIB) interfaces on the client node.
3.3.4.3 Configuring the NTP server For the HP SFS diagnostics to work correctly, the date and time on the client nodes must be synchronized with the date and time on other client nodes, and with the date and time on the servers in the HP SFS system. In addition, synchronizing the date and time on the systems keeps the logs on the systems synchronized, and is helpful when diagnosing problems.
3.4.1 Step 1: Upgrading the HP SFS client software To upgrade the HP SFS client software on RHEL and SLES 9 SP3 client systems, perform the following steps: 1. On the node that you are going to upgrade, stop all jobs that are using Lustre file systems. To determine what processes on a client node are using a Lustre file system, enter the fuser command as shown in the following example, where /data is the mount point of the file system.
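A typical invocation is shown below; the exact options accepted by fuser vary slightly between distributions, so check the fuser(1) manpage on your client node:
# fuser -cu /data
The command lists the PID and owner of every process that has a file open under the mount point; those jobs must be stopped before the upgrade.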
If a modified version of the python-ldap package is not provided for your client architecture/ distribution on the HP StorageWorks Scalable File Share Client Software CD-ROM for the version you are upgrading to, you do not need to remove (and reinstall) the python-ldap package. In the example shown here, the python-ldap package is removed.
• Examine the /etc/sfstab and /etc/sfstab.proto files to ensure that the mount directives using the lnet: protocol have been added. NOTE: The sfsconfig command uses the http: protocol to get configuration information from the HP SFS servers. If the client node does not have access to the HP SFS servers over a TCP/IP network, or if the servers are offline, the sfsconfig command will not be able to configure the client node correctly, and you will have to modify the configuration file manually.
user3 22513 ..c.. csh
user3 31820 ..c.. res
user3 31847 ..c.. 1105102082.1160
user3 31850 ..c.. 1105102082.
user3 31950 ..c..
user3 31951 ..c..
user1 32572 ..c..
7. Replace or edit the /etc/sfstab.proto file on the client node, as follows: • If you saved a copy of the /etc/sfstab.proto file during the upgrade process, replace the /etc/sfstab.proto file on the client node with the older (saved) version of the file. • If you did not save a copy of the /etc/sfstab.proto file during the upgrade process, you must edit the /etc/sfstab.
4 Mounting and unmounting Lustre file systems on client nodes
This chapter provides information on mounting and unmounting file systems on client nodes, and on configuring client nodes to mount file systems at boot time. The topics covered include the following:
• Overview (Section 4.1)
• Mounting Lustre file systems using the sfsmount command with the lnet: protocol (Section 4.2)
• Mounting Lustre file systems using the mount command (Section 4.3)
4.1 Overview NOTE: Before you attempt to mount a Lustre file system on a client node, make sure that the node has been configured as described in Chapter 2 or Chapter 3. In particular, the client node must have an options lnet setting configured in the /etc/modprobe.conf.lustre or /etc/modules.conf.lustre file. A Lustre file system can be mounted using either the sfsmount(8) command (the recommended method) or the standard mount(8) command with a file system type of lustre.
that you convert existing systems to use the lnet: protocol. The process for converting from the ldap: protocol to the lnet: protocol is described in Chapter 2 (for HP XC systems) and Chapter 3 (for other types of client systems). A Lustre file system comprises a number of MDS and OST services. A Lustre file system cannot be mounted on a client node until all of the file system services are running.
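One way to confirm that all of the services for a file system are running is to check its status on the HP SFS administration server before attempting the mount; the file system name used here is illustrative, and the exact argument form of the command is described in the HP StorageWorks Scalable File Share System User Guide:
# sfsmgr show filesystem data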
TIP: If the client node has access to the HP SFS system on a TCP/IP network, you can generate the correct address to be used in the sfsmount command with the lnet: protocol by entering the sfsmount command with the -X option and the http: protocol, as shown in the following example: # sfsmount -X http://south/test /mnt/test 4.3 Mounting Lustre file systems using the mount command NOTE: Lustre file systems must be mounted as root user, and the environment—in particular the PATH— must be that of root.
mds_service is the name of the MDS service on the HP SFS system (as shown by the sfsmgr show filesystem command on the HP SFS server) For example: south-mds3 profile Is in the format client_type, and type is one of tcp, elan, gm, vib 4.5 Mount options Table 4-1 describes the options that can be specified in the -o option list with the mount command and/or the sfsmount command for Lustre file systems.
Table 4-1 Mount options 4–6 Name mount and/or sfsmount Description [no]repeat sfsmount Specifies whether repeated attempts are to be made to mount the file system (until the mount operation succeeds), or if only one attempt is to be made to mount the file system. When the sfsmount command is run interactively, the default for this option is norepeat. When the sfsmount command is used by the SFS service, the SFS service adds the repeat mount option unless the /etc/sfstab or /etc/sfstab.
Table 4-1 Mount options

Name: [no]repeat
mount and/or sfsmount: sfsmount
Description: Specifies whether repeated attempts are to be made to mount the file system (until the mount operation succeeds), or if only one attempt is to be made to mount the file system. When the sfsmount command is run interactively, the default for this option is norepeat. When the sfsmount command is used by the SFS service, the SFS service adds the repeat mount option unless the /etc/sfstab or /etc/sfstab.
filesystem Specifies the name of the Lustre file system that is to be unmounted. mountpoint Specifies the mount point of the file system that is to be unmounted. This is the recommended argument. Do not include a trailing slash (/) at the end of the mount point. Table 4-2 lists the options that can be used with the umount and sfsumount commands.
An alternative method of unmounting Lustre file systems on the client node is to enter the service sfs stop command, as described in Section 4.7. However, note that when you run the service sfs stop command, only the file systems specified in the /etc/sfstab file are unmounted. File systems that were mounted manually are not unmounted. 4.7 Using the SFS service This section is organized as follows: • Mounting Lustre file systems at boot time (Section 4.7.
To configure a client node to automatically mount a file system at boot time, perform the following steps: 1. On the client node, create a directory that corresponds to the mount point that was specified for the file system when it was created, as shown in the following example: # mkdir /usr/data 2. Create an entry for the Lustre file system either in the /etc/sfstab file on the client node or in the /etc/sfstab.proto file.
CAUTION: When you move an entry from a client node’s /etc/sfstab file to the /etc/sfstab.proto file, you must delete the entry from the static section of the /etc/sfstab file (that is, the section of the file outside of the lines generated when the /etc/sfstab.proto file is processed). Each mount entry for a client node must only exist either in the /etc/sfstab.proto file or in the static section of the /etc/sfstab file. Example The following is an example of a complete /etc/sfstab.
After the /etc/sfstab.proto file shown above is processed by the SFS service, the /etc/sfstab file on the delta1 node will include the following lines:
##################### BEGIN /etc/sfstab.proto SECTION #####################
.
.
.
lnet://35@elan0,34@elan0:/south-mds3/client_elan /usr/data sfs server=south,fs=data 0 0
lnet://35@elan0,34@elan0:/south-mds4/client_elan /usr/scratch sfs server=south,fs=scratch 0 0
lnet://35@elan0,34@elan0:/south-mds5/client_elan /usr/test sfs server=south,fs=test 0 0
.
.
.
4.7.6 The service sfs status command The service sfs status command shows information on the status (mounted or unmounted) of Lustre file systems on the client node. 4.7.7 The service sfs cancel command The service sfs cancel command cancels pending mount operations that are taking place in the background.
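For example, run as root on the client node:
# service sfs status
# service sfs cancel
The first command reports the mounted or unmounted status of the Lustre file systems on the node; the second cancels any mount operations that are still retrying in the background.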
4.8 Alternative sfsmount modes In addition to supporting the standard mount command with the lnet: protocol (as described in Section 4.4), the sfsmount command also supports the following mount modes: • The standard mount command with the http: protocol. (see Section 4.8.1). • The lconf command with the ldap: protocol (see Section 4.8.2).
4.8.2 Mounting Lustre file systems using the sfsmount command with the ldap: protocol
NOTE: Lustre file systems must be mounted as root user, and the environment—in particular the PATH—must be that of root. Do not use the su syntax when changing to root user; instead, use the following syntax: su -
NOTE: The network or networks that a client node can use to access the HP SFS system may or may not be configured with an alias IP address.
4.9 Restricting interconnect interfaces on the client node When a Gigabit Ethernet interconnect is used to connect client nodes to an HP SFS system, the default behavior is for only the first Gigabit Ethernet interface on a client node to be added as a possible network for file system traffic. To ensure that the correct interfaces on the client node are available for file system traffic, you must ensure that the options lnet settings in the /etc/modprobe.conf.lustre or /etc/modules.conf.
• If a server in the HP SFS system is shut down or crashes, or if the file system itself is stopped, all client connections go to the DISCONN state. Typically, the connections go back to alternating between the CONNECT state and the DISCONN state after about 50 seconds. The REPLAY_WAIT state indicates that the connection has been established and that the file system is recovering; in this case, the state changes to FULL within a few minutes.
• The following message shows that the client node is attempting to connect to a server in the HP SFS system: kernel: Lustre: 4560:0:(import.c:310:import_select_connection()) MDC_n1044_sfsalias-mds5_MNT_client_vib: Using connection NID_16.123.123.102_UUID In this example, the connection is to the mds5 service on server 16.123.123.102. On its own, this message does not indicate a problem.
5 Configuring NFS and Samba servers to export Lustre file systems HP SFS allows client systems to use the NFS or SMB (using Samba) protocols to access Lustre file systems. If you intend to use this functionality, you must configure one or more HP SFS client nodes as NFS or Samba servers to export the file systems. This chapter provides information on configuring such servers, and is organized as follows: • Configuring NFS servers (Section 5.1) • Configuring Samba servers (Section 5.
5.1 Configuring NFS servers Some legacy client systems can only use the NFS protocol; HP allows such systems to access Lustre file systems via NFS servers. NFS servers are specialized Lustre clients that access the Lustre file system and export access to the file system over NFS. To use this functionality, you must configure one or more HP SFS client nodes as NFS servers for the Lustre file systems.
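As a minimal sketch of the export step only (the exported path, the client specification, and the export options here are illustrative and must follow your site's policy, and the Lustre file system must already be mounted on the node):
# cat /etc/exports
/mnt/data    *.mysite.example.com(rw,sync)
# exportfs -ra
# service nfs start
The remaining considerations for NFS servers are described in the sections that follow.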
5.1.2 Configuration factors for NFS servers When configuring HP SFS client nodes as NFS servers, consider the following points: • A Lustre file system may be exported over NFS or over Samba, but may not be exported over both NFS and Samba at the same time. • Multiple HP SFS client nodes configured as NFS servers may export different Lustre file systems. • Multiple HP SFS client nodes configured as NFS servers may export the same Lustre file system.
5.1.3 Configuration factors for multiple NFS servers NFS services may be configured to expand the throughput and performance of the NFS services to the NFS client systems by having multiple NFS servers. The basic setup procedure of an NFS server is not affected by the use of multiple NFS servers; however, the following guidelines are recommended: • All NFS servers that are exporting the same file system must export it by the same name.
5.1.3.2 NFS performance scaling example
Figure 5-2 illustrates how performance is affected when the number of NFS servers and client systems is increased in a system configured as in Figure 5-1.
Figure 5-2 NFS performance scaling (HP SFS Version 2.2-0; default stripe size and default stripe count; nfs_readahead=0; 8GB files; aggregate throughput in KB/sec for initial write, rewrite, read, and re-read, plotted against the number of NFS servers and NFS clients, 1 to 4)
5.1.6 Optimizing NFS server performance To optimize NFS performance, consider the following recommendations for the configuration on the HP SFS client node that has been configured as an NFS server: • As part of the installation of the HP SFS client software on client nodes, the kernel on the client node is patched to provide support for Lustre file systems. In addition, patches are supplied to improve NFS client system read performance.
• The functionality that allows Lustre file systems to be exported via Samba is intended for interoperability purposes. When a Lustre file system is exported via Samba, performance will be lower than when the file system is accessed directly by a native HP SFS client system.
6 User interaction with Lustre file systems
This chapter is organized as follows:
• Defining file stripe patterns (Section 6.1)
• Dealing with ENOSPC or EIO errors (Section 6.2)
• Using Lustre file systems — performance hints (Section 6.3)
6.1 Defining file stripe patterns Lustre presents a POSIX API as the file system interface; this means that POSIX-conformant applications work with Lustre. There are occasions when a user who is creating a file may want to create a file with a defined stripe pattern on a Lustre file system. This section describes two methods of doing this: the first method uses the lfs executable (see Section 6.1.1), and the second method uses a C program (see Section 6.1.2).
6.1.2 Using a C program to create a file
The following C program fragment shows an example of how to create a file with a defined stripe pattern; the program also determines that the file system is a Lustre file system.
Example 6-1 C program fragment—creating a file with a defined stripe pattern
/* NOTE: the header names in the original listing are not legible in this copy;
   the includes below are a typical, assumed set for this kind of example. */
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <lustre/lustre_user.h>
6.1.3 Setting a default stripe size on a directory If you want to create many files with the same stripe attributes and you want those files to have a stripe configuration that is not the default stripe configuration of the file system, you can create the files individually as described earlier in this chapter. Alternatively, you can set the stripe configuration on a subdirectory and then create all of the files in that subdirectory.
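A sketch of that approach from the command line is shown below; the directory, stripe size (in bytes), stripe index, and stripe count are illustrative, and the exact lfs setstripe option syntax differs between Lustre versions, so check lfs help setstripe on your client node:
$ lfs setstripe /mnt/data/results 1048576 -1 4
$ lfs getstripe /mnt/data/results
Files subsequently created in /mnt/data/results inherit the stripe configuration that was set on the directory.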
6.2.1 Determining the file system capacity using the lfs df command
You can use the lfs df command to determine if the file system is full, or if one or more of the OST services in the file system are full. You can run the lfs df command as an unprivileged user on a client node (in the same way as the df command). The following example shows output from the lfs df command.
b. Check the MDS service by entering the command shown in the following example on the client node: # cat /proc/fs/lustre/mdc/MDC_delta57_south-mds5_MNT_client_gm/filesfree 10 # In this example, delta57 is the client node where the command is being run; south is the name of the HP SFS system; mds5 is the name of the MDS service the client node is connected to; and client_gm indicates that a Myrinet interconnect is being used.
south-ost51_UUID south-ost52_UUID filesystem summary: 2113787820 681296236 1432491584 2113787820 532323328 1581464492 8455151280 2579597988 5875553292 32 /mnt/data[OST:2] 25 /mnt/data[OST:3] 30 /mnt/data # 2. Deactivate the OST service, as described in the Managing space on OST services section in Chapter 5 of the HP StorageWorks Scalable File Share System User Guide. 3.
6.3.1.1 Improving the performance of the rm -rf command If the rm -rf command is issued from a single client node to a large directory tree populated with hundreds of thousands of files, the command can sometimes take a long time (in the order of an hour) to complete the operation. The primary reason for this is that each file is unlinked (using the unlink() operation) individually and the transactions must be committed to disk at the server.
6.3.3 Variation of file stripe count with shared file access When multiple client processes are accessing a shared file, aligning the file layout (file stripe size and file stripe count) with the access pattern of the application is beneficial.
Server-side timeouts Server-side timeouts can occur as follows: • When client nodes are connected to MDS and OST services in the HP SFS system, the client nodes ping their server connections at intervals of one quarter of the period specified by the Lustre timeout attribute. If a client node has not been in contact for at least 2.25 times the period specified by the Lustre timeout attribute, the Lustre software proactively evicts the client node.
The parameters that control client operation interact as shown in the following example. In this example, the configuration is as follows: • There are 30 OST services, one on each server in the HP SFS system. • All client nodes and servers are connected to a single switch with an overall throughput of 1Gb/sec. • The max_dirty_mb parameter on the client node is 32MB for each OST service that the client node is communicating with.
6.3.5 Using a Lustre file system in the PATH variable HP strongly recommends that you do not add a Lustre file system into the PATH variable as a means of executing binaries on the Lustre file system. Instead, use full paths for naming those binaries. If it is not possible to exclude a Lustre file system from the PATH variable, the Lustre file system must come as late in the PATH definition as possible, to avoid a lookup penalty on local binary execution.
7 Troubleshooting
This chapter provides information for troubleshooting possible problems on client systems. The topics covered include the following:
• Installation issues (Section 7.1)
• File system mounting issues (Section 7.2)
• Operational issues (Section 7.3)
• Miscellaneous issues (Section 7.4)
7.1 Installation issues This section deals with issues that may arise when the HP SFS software is being installed on the client nodes. The section is organized as follows: • The initrd file is not created (Section 7.1.1) • Client node still boots the old kernel after installation (Section 7.1.2) 7.1.1 The initrd file is not created When you have installed the client kernel (see Section 3.3.2), there should be an initrd file (/boot/ initrd-kernel_version.img) on the client node; however, if the modules.
7.1.2 Client node still boots the old kernel after installation If a client node does not boot the new kernel after the HP SFS client software has been installed on the node, it may be because the new kernel has not been defined as the default kernel for booting. To correct this problem, edit the appropriate bootloader configuration file so that the new kernel is selected as the default for booting and then reboot the client node.
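If GRUB is the boot loader, the following commands show one way to check and correct the default kernel; the kernel image name is the example name used elsewhere in this guide and will differ on your system:
# grubby --default-kernel
# grubby --set-default /boot/vmlinuz-2.4.21-37.EL_SFS2.2_0
# reboot
For LILO or other boot loaders, edit the corresponding configuration file instead.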
• The interconnect may not be functioning correctly. If all of the MDS and OST services associated with the file system are available and the client node has been configured correctly but is still failing to mount or unmount a file system, ensure that the interconnect that the client node is using to communicate with the servers is functioning correctly. If none of the above considerations provides a solution to the failure of the mount or unmount operation, reboot the client node.
To configure Lustre to use a different port on the client node when using a Myrinet interconnect, perform the following steps: 1. On the administration server and on the MDS server in the HP SFS system, perform the following tasks: a. Stop all file systems, by entering the stop filesystem filesystem_name command for each file system. b. Back up the /etc/modprobe.conf file, as follows: # cp /etc/modprobe.conf /etc/modprobe.conf.save c. Edit the /etc/modprobe.
7.2.5 Troubleshooting stalled mount operations If a mount operation stalls, you can troubleshoot the problem on the HP SFS system. Refer to Chapter 9 of the HP StorageWorks Scalable File Share System User Guide (specifically the Troubleshooting client mount failures section) for more information. 7.3 Operational issues This section deals with issues that may arise when client nodes are accessing data on Lustre file systems.
   obdidx    objid    objid    group
        0     1860    0x744        0
        1     1856    0x740        0
        2     1887    0x75f        0
        3     1887    0x75f        0
2. Rename the new file to the original name using the mv command, as shown in the following example:
# mv scratch.new scratch
mv: overwrite ’scratch’? y
# lfs getstripe scratch
OBDS:
0: ost1_UUID
1: ost2_UUID
2: ost3_UUID
3: ost4_UUID
./scratch
   obdidx    objid    objid    group
        0     1860    0x744        0
        1     1856    0x740        0
        2     1887    0x75f        0
        3     1887    0x75f        0
In a situation where only one or two client nodes have crashed and a lock is needed, there is a pause of 6 to 20 seconds while the crashed client nodes are being evicted. When such an event occurs, Lustre attempts to evict clients one by one. A typical log message in this situation is as follows: 2005/09/30 21:02:53 kern i s5 : LustreError: 4952:0:(ldlm_lockd.c:365:ldlm_failed_ast()) ### blocking AST failed (-110): evicting client b9929_workspace_9803d79af3@NET_0xac160393_UUID NID 0xac160393 (172.22.3.
Preventing and correcting the problem You can take action to prevent access to files hanging as described above; however, if you find that an application has already hung, you can take corrective action.
In the following example, the output from a fully dual-connected configuration is shown; in this example delta6 is the client node:
[root@delta6 ~]# lctl --net tcp peer_list
12345-10.128.0.72@tcp [1]10.128.0.61->10.128.0.72:988
12345-10.128.0.72@tcp [0]10.128.8.61->10.128.8.72:988
12345-10.128.0.73@tcp [1]10.128.0.61->10.128.0.73:988
12345-10.128.0.73@tcp [0]10.128.8.61->10.128.8.73:988
12345-10.128.0.74@tcp [1]10.128.0.61->10.128.0.74:988
12345-10.128.0.74@tcp [0]10.128.8.61->10.128.8.
A Using the sfsconfig command The sfsconfig command is a tool that you can use to automatically perform the following tasks on client nodes: • Configure the correct options lnet settings in the /etc/modprobe.conf and /etc/modprobe.conf.lustre files or the /etc/modules.conf and /etc/modules.conf.lustre files (depending on the client distribution). (In the remainder of this appendix, references to the /etc/modprobe.conf file can be understood to include also the /etc/modprobe.conf.
The sfsconfig command also adds the lquota setting (which is needed to allow the client node to use quotas functionality) in the /etc/modprobe.conf or /etc/modules.conf file. The sfsconfig command also updates the file system mount directives in the /etc/sfstab and /etc/sfstab.proto files.
The target can be one or more of the following:
conf
  Specifies that the /etc/modprobe.conf or /etc/modules.conf file is to be updated. The command configures the options lnet settings for all interconnects that can be used to access the file systems served by the identified or specified servers. You can later edit the /etc/modprobe.conf or /etc/modules.conf file to restrict the interfaces that can be used for mounting file systems.
tab
  Specifies that the /etc/sfstab and the /etc/sfstab.
The following command adds the required options lnet setting to the /etc/modprobe.conf or /etc/modules.conf file; it also updates the existing lnet: mount directives in the /etc/sfstab and /etc/sfstab.proto files, while keeping the existing ldap: and http: mount directives: # sfsconfig -H -L all Pseudo mount options Note that the following pseudo mount options are provided as part of the /etc/sfstab and /etc/sfstab.proto mount options list for use by the sfsconfig command only.
B Options for Lustre kernel modules
This appendix is organized as follows:
• Overview (Section B.1)
• Setting the options lnet settings (Section B.2)
• Modifying the /etc/modprobe.conf file on Linux Version 2.6 client nodes manually (Section B.3)
• Modifying the /etc/modules.conf file on Linux Version 2.4 client nodes manually (Section B.4)
B.1 Overview To support the functionality provided in HP SFS Version 2.2, the /etc/modprobe.conf and /etc/modprobe.conf.lustre files or the /etc/modules.conf and /etc/modules.conf.lustre files (depending on the client distribution) on the HP SFS client nodes must be configured with the appropriate settings. You can use the sfsconfig command to modify the files automatically (see Appendix A), or you can edit the files manually, as described in Section B.3 and Section B.4.
B.2 Setting the options lnet settings The options lnet settings are critical in ensuring both connectivity and performance when client nodes access the HP SFS system. When you are determining the appropriate settings for the client nodes, take account of the following rules: • There can only be one entry for any network type other than Gigabit Ethernet interconnects. • For Gigabit Ethernet networks, match the numerical identifier for a network and the identifier of the server, if possible.
• Two (non-bonded) Gigabit Ethernet interconnects, which the client node uses to access two different HP SFS systems. One of the HP SFS systems is configured with a dual Gigabit Ethernet interconnect; the second HP SFS system is configured with a single Gigabit Ethernet interconnect. The client node accesses only HP SFS Version 2.2 servers that are not running in Portals compatibility mode: options lnet network=tcp0(eth1,eth2),tcp1(eth1) portals_compatibility=none Section B.2.
b. When you have identified an appropriate server for the test, enter the following command on that server, to identify the NID of the server: # lctl list_nids 34@elan 0xdd498cfe@gm 10.128.0.41@tcp # 5. Enter the lctl ping command to verify connectivity, as shown in the following example, where server_nid is the NID identified in Step 4.
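For example, using the TCP NID reported by lctl list_nids above:
# lctl ping 10.128.0.41@tcp
A successful ping prints a list of the NIDs on the remote node; an error or a hang indicates a connectivity or configuration problem on that network.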
B.3 Modifying the /etc/modprobe.conf file on Linux Version 2.6 client nodes manually TIP: You can restrict the Gigabit Ethernet interfaces that a client node uses for interaction with an HP SFS system, by specifying options lnet settings only for the interfaces that are to be used. On client nodes that are running a Linux 2.6 kernel, modify the /etc/modprobe.conf file as follows: 1. Identify the correct settings for the interconnects that are to be used to connect to the HP SFS system. 2.
4. To configure the options lnet settings on the client node, add an entry to the /etc/modules.conf.lustre file to specify the networks that are to be used to connect to the HP SFS system. Use the following syntax: options lnet option1=value1 [option2=value2...] The syntax of the supported options is as follows: networks=network1[,network2...
C Building an HP SFS client kit manually
This appendix describes how to build an HP SFS client kit manually (that is, not using the sample script provided by HP). The appendix is organized as follows:
• Overview (Section C.1)
• Building the HP SFS client kit manually (Section C.2)
• Output from the SFS Client Enabler (Section C.3)
• Locating the python-ldap and hpls-diags-client packages (Section C.4)
C.1 Overview The build_SFS_client.sh example script provided on the HP StorageWorks Scalable File Share Client Software CD-ROM works for many common distributions, and HP recommends that you use it if possible. The use of the script is described in Section 3.2. However, if the script does not work for your client distribution, you can build the kit manually, as described in this appendix.
6. If you wish to build RPM files, it is best to create an rpmmacros file (if one does not already exist for the build user). This is created in the output/ directory and will result in all RPM activity taking place in that directory, including the resulting RPM files being placed there.
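One common way to achieve this is to point the standard %_topdir RPM macro at the output/ directory in the build user's macros file; the file location and path below are illustrative:
$ cat ~/.rpmmacros
%_topdir /var/tmp/client_enabler/output
With a %_topdir setting of this kind, all RPM build activity takes place under that directory.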
11. If your client has additional Lustre patches listed in the client_enabler/src/arch/distro/ lustre_patches/series file, copy the additional patches from the HP StorageWorks Scalable File Share Client Software CD-ROM to the src/ directory, as shown in the following example: $ cp -p /mnt/cdrom/client_enabler/src/i686/SuSE_9.0/lustre_patches/ SuSE_python2.3_bug2309.patch src/ 12.
d. Add the additional required patches to the kernel.spec file in the same way that you applied the Lustre patches. See Section 3.2.6 for a list of additional patches. You can find the additional patches in the src directory; the series file (which lists the patches) is on the HP StorageWorks Scalable File Share Client Software CD-ROM under the client_enabler/src/arch/distro/patches/ directory.
To create a built kernel tree, perform the following steps:
a. Extract the kernel sources from the src/ directory and put them in the build/linux/ directory, as shown in the following example:
$ mkdir -p build/linux
$ cd src/linux-2.4.21/
$ tar -cpf - ./ | (cd ../../build/linux; tar -xpf -;)
$ cd ../../
b. Apply the Lustre patches, as follows.
i. Extract the Lustre source files, as follows:
$ cd build
$ tar -xzpf ../src/lustre-V1.4.tgz
$ cd ..
ii.
15. Build the interconnect driver trees. • If you are building the HP SFS client kit with support for a Voltaire InfiniBand interconnect, see Section 3.2.2.1; perform Steps 1 through 9 of that section. • For other interconnect types, refer to your interconnect manufacturer's instructions for this task. 16. Build Lustre, as follows: a. Extract the Lustre sources, as follows: $ cd build $ tar -xzpf ../src/lustre-V1.4.tgz $ cd .. b. Configure the Lustre sources.
b. Generate the spec file, by entering the following commands: $ m4 -D_VERSION=2.2 -D_RELEASE=0 -D_LICENSE=commercial -D_URL=http://www.hp.com/go/hptc -D_DISTRIBUTION="%{distribution}" -D_VENDOR="SFS client manual" -D_PACKAGER=put_your_email_address_here -D_HPLS_INSTALL_DIR="/usr/opt/hpls" -D_STANDALONE_BUILD=1 lustreclient.spec.m4 > ../../output/specs/lustre-client.spec c. Copy the hpls-lustre tarball file into the output/src directory, by entering the following command: $ cp hpls-lustre.tar.gz ../..
C.3 Output from the SFS Client Enabler When you build an HP SFS client kit manually, the output directories for the RPM files are as follows: On RHEL systems: • /usr/src/redhat/RPMS/ • /usr/src/redhat/SRPMS/ On SLES 9 systems: • /usr/src/packages/RPMS/ • /usr/src/packages/SRPMS/ If you made any changes to the process described in Section C.2, your output directories may be different. If you created and used the rpmmacros file as described in Step 6 in Section C.
• hpls-diags-client
If possible, use the version of the hpls-diags-client package that you built when you created the HP SFS client kit. However, if the package failed to build, you can find the hpls-diags-client package on the HP StorageWorks Scalable File Share Client Software CD-ROM, in the appropriate directory for your particular client architecture/distribution combination.
Glossary administration server The ProLiant DL server that the administration service runs on. Usually the first server in the system. See also administration service administration service The software functionality that allows you to configure and administer the HP SFS system. See also administration server ARP Address Resolution Protocol. ARP is a TCP/IP protocol that is used to get the physical address of a client node or server.
Internet address A unique 32-bit number that identifies a host’s connection to an Internet network. An Internet address is commonly represented as a network number and a host number and takes a form similar to the following: 192.168.0.1. internet protocol See IP IP Internet Protocol. The network layer protocol for the Internet protocol suite that provides the basis for the connectionless, best-effort packet delivery service. IP includes the Internet Control Message Protocol (ICMP) as an integral part.
Object Storage Server A ProLiant DL server that OST services run on. See also OST service OST service The Object Storage Target software subsystem that provides object services in a Lustre file system. See also Object Storage Server Portals A message passing interface API used in HP SFS versions up to and including Version 2.1-1. Python Python is an interpreted, interactive, object-oriented programming language from the Python Software Foundation (refer to the www.python.org Web site).