HP XC System Software Administration Guide Version 3.2

IMPORTANT: The fw_ver parameter indicates the firmware version. The InfiniBand board
firmware should be the latest version available with your software release, and must be at
least as recent as the minimum firmware versions listed in the HP XC master firmware list:
http://www.docs.hp.com/en/linuxhpc.html
When examining the ibv_devinfo command output, verify that at least one port of the
InfiniBand board reports the PORT_ACTIVE state. A PORT_DOWN or PORT_INITIALIZE state
means that the InfiniBand board is not communicating properly with the InfiniBand switch.
Possible causes are a missing cable, a poor cable connection to the switch, an InfiniBand
switch that is not functioning correctly, or an InfiniBand Subnet Manager that is not running
properly. Troubleshoot the cables or switch as necessary using the information in the
InfiniBand vendor's documentation and the HP InfiniBand hardware documentation.
If you see no output at all, or if you see an error message, the InfiniBand stack or kernel
might be improperly installed. Continue with the troubleshooting steps, starting from
step 4, to verify that they are properly installed.
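The three outcomes described above (an active port, a port that is down or initializing, and no usable output) can be told apart with a small shell helper. The following is a minimal sketch, not part of the HP XC software: the function name check_port_state and its exit codes are hypothetical, and it assumes that the port state appears in the ibv_devinfo output as text beginning with PORT_.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HP XC): classify ibv_devinfo-style
# text read from standard input.
# Exit status: 0 = at least one port is PORT_ACTIVE
#              1 = a port is down or initializing, and none is active
#              2 = no recognizable port state (check the stack and kernel)
check_port_state() {
    cps_input=$(cat)
    if printf '%s\n' "$cps_input" | grep -q 'PORT_ACTIVE'; then
        echo "ok: at least one port is PORT_ACTIVE"
        return 0
    # PORT_INIT also matches the longer spelling PORT_INITIALIZE.
    elif printf '%s\n' "$cps_input" | grep -Eq 'PORT_(DOWN|INIT)'; then
        echo "link problem: check the cables, switch, and Subnet Manager"
        return 1
    else
        echo "no port state found: verify the InfiniBand stack and kernel"
        return 2
    fi
}

# Typical use on a node where the ibv_devinfo command is installed:
#   ibv_devinfo | check_port_state
```

Because the helper reads standard input, you can also run it against saved command output from another node.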
3. Run the ibstatus command to verify that the InfiniBand system interconnect is connected
and operating correctly:
[root@n1 ~]# ibstatus
Infiniband device 'mthca0' port 1 status:
default gid: fe80:0000:0000:0000:0017:08ff:ffd1:33b5
base lid: 0x3
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 20 Gb/sec (4X DDR)
Examine this output for the connection speed, as reported by the rate parameter. If your
InfiniBand card, cables, and switch are double data rate (DDR) capable, the rate parameter
shows 4X DDR after the transfer speed. If it does not, troubleshoot the InfiniBand board,
cables, and switch as necessary until the board reports that it is running at DDR speed.
Rebooting a node that is running at single data rate (SDR) speed sometimes enables it to
negotiate DDR speed.
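The rate check can be scripted in the same way. The following is a minimal sketch under stated assumptions: the function name report_rate and its exit codes are hypothetical, it expects ibstatus-style text on standard input with a line that begins with rate:, and it inspects only the first rate line, so a node with several ports needs one check per port.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HP XC): report whether the first
# "rate:" line in ibstatus-style text on standard input shows 4X DDR.
# Exit status: 0 = DDR, 1 = rate line present but not DDR, 2 = no rate line
report_rate() {
    rr_line=$(grep -E '^[[:space:]]*rate:' | head -n 1)
    case "$rr_line" in
        *"4X DDR"*) echo "DDR: $rr_line";      return 0 ;;
        "")         echo "no rate line found"; return 2 ;;
        *)          echo "not DDR: $rr_line";  return 1 ;;
    esac
}

# Typical use on a node where the ibstatus command is installed:
#   ibstatus | report_rate
```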
4. Use the uname command to make sure that you are running an HP XC kernel. The HP XC
kernels are identified by the presence of XC in the kernel name:
[root@n1 ~]# uname -a
Linux n1 2.6.9-42.9hp.XCsmp #1 SMP date and time
x86_64 x86_64 x86_64 GNU/Linux
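To script this check, you can test the kernel release string for the XC marker. The following is a minimal sketch: the function name is_xc_kernel is hypothetical, and it assumes only what the step above states, namely that HP XC kernel names contain the characters XC.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with HP XC): succeed if the kernel
# release string passed as the first argument contains "XC".
is_xc_kernel() {
    case "$1" in
        *XC*) return 0 ;;
        *)    return 1 ;;
    esac
}

# Typical use (uname -r prints only the kernel release):
#   is_xc_kernel "$(uname -r)" || echo "not an HP XC kernel" >&2
```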
5. Ensure that the OFED InfiniBand RPMs are installed:
# rpm -q -a
...
dapl-1.2.0-0
dapl-devel-1.2.0-0
ibutils-1.0-0
kernel-ib-1.1-2.6.9_42.9hp.XCsmp
kernel-ib-devel-1.1-2.6.9_42.9hp.XCsmp
libibcm-0.9.0-0
libibcm-devel-0.9.0-0
libibcommon-1.0-0
libibcommon-devel-1.0-0
libibmad-1.0-0
libibmad-devel-1.0-0
libibumad-1.0-0
libibumad-devel-1.0-0
libibverbs-1.0.4-0
libibverbs-devel-1.0.4-0