HP XC System Software Installation Guide Version 4.0

14.3 Troubleshooting the Imaging Process
This section provides hints for troubleshooting the imaging process.
System imaging and node configuration information is stored in the following log files:
/hptc_cluster/adm/logs/imaging.log
/var/log/systemimager/rsyncd
/hptc_cluster/adm/logs/startsys.log
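These logs are plain text, so standard tools are enough to follow them during an imaging run. The following is a minimal sketch and is not part of the HP XC software: count_matches is a hypothetical helper, and it assumes that the diagnostic strings mentioned later in this section appear literally in the log lines.

```shell
#!/bin/sh
# Log paths as listed in this section.
IMAGING_LOG=/hptc_cluster/adm/logs/imaging.log
RSYNCD_LOG=/var/log/systemimager/rsyncd

# count_matches PATTERN FILE   (hypothetical helper)
# Prints the number of lines in FILE that contain the fixed string
# PATTERN, ignoring case. Prints 0 if FILE is missing or unreadable.
count_matches() {
    if [ -r "$2" ]; then
        # grep -c prints the count itself; "|| :" hides its non-zero
        # exit status when the count is 0.
        grep -icF -- "$1" "$2" || :
    else
        echo 0
    fi
}
```

For example, tail -f /hptc_cluster/adm/logs/imaging.log follows the imaging log as it grows, and count_matches "starting imaging" "$RSYNCD_LOG" reports how many such messages the rsyncd log contains.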
Table 14-1 (page 178) lists problems you might encounter as the golden image is being propagated
to client nodes and describes how to diagnose and resolve each one.
Table 14-1 Diagnosing System Imaging Problems
Symptom: A node boots to local disk and runs through the node configuration phase
(nconfigure) instead of imaging.
How to diagnose: An nconfig starting entry appears in the imaging.log file.
Possible solution: Verify BIOS settings to ensure that the node is set to network boot and
that the correct network adapter is at the top of the boot order.

Symptom: A node hangs while imaging.
How to diagnose: You can determine when a node hangs during imaging by monitoring the
imaging.log file, which is described in “How To Monitor An Imaging Session” (page 180).
For closer inspection, set the correct console parameter in the
/tftpboot/pxelinux.cfg/default file before booting the node.
Possible solution: Retry the imaging operation. Verify that the network is functioning
properly.

Symptom: A node is dropped out of the imaging process.
How to diagnose: You can determine when a node drops out of the imaging process by
monitoring the imaging.log file. One possible cause is that the network speed of the node
dropped below the acceptable range. The imaging environment includes ethtool, which queries
the speed of the network connection with the head node; a node is dropped from the imaging
process if the speed is less than 1000 Mb/s (megabits per second).
Possible solution: Configure the speed threshold by adding ETHSPEED=n to the kernel command
line. If the reported speed of the network device is greater than n, imaging proceeds.
Setting ETHSPEED=0 forces imaging to occur unconditionally.

Symptom: Disk device not found.
How to diagnose: Monitor the imaging.log file or watch the console.
Possible solution: Ensure that the disk is working correctly and is properly seated in the
node.

Symptom: The node configuration phase (nconfig) fails, and the system is left in single-user
mode.
How to diagnose: Monitor the imaging.log file. The system boots completely, but the sinfo
command does not show the node as available.
Possible solution: Correct the cluster configuration using the cluster_config utility. Then
either use the startsys command to reimage the node or rerun the nconfigure phase:
# service nconfig nconfigure

Symptom: A node spontaneously reboots during imaging.
How to diagnose: Multiple “starting imaging” messages appear in the rsyncd log file.
Possible solution: Verify hardware, BIOS, and kernel boot option settings.
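
The network speed check described for dropped nodes can be illustrated with a short sketch. This is not the code used by the imaging environment; parse_speed and speed_ok are hypothetical helpers. The sketch assumes that ethtool prints a line of the form "Speed: 1000Mb/s", that a node whose speed exactly equals the threshold is accepted (the table does not say), and that the default threshold is the 1000 figure quoted in the table.

```shell
#!/bin/sh
# parse_speed   (hypothetical helper)
# Reads ethtool-style output on stdin and prints the numeric link
# speed in Mb/s. Prints nothing if no numeric speed is reported.
parse_speed() {
    sed -n 's|^[[:space:]]*Speed:[[:space:]]*\([0-9][0-9]*\)Mb/s.*|\1|p'
}

# speed_ok SPEED [THRESHOLD]   (hypothetical helper)
# Returns 0 if imaging may proceed and 1 if the node would be dropped.
# Assumption: a node is dropped only when SPEED is below THRESHOLD,
# so a threshold of 0 (ETHSPEED=0) never drops a node.
speed_ok() {
    speed=${1:-0}            # treat a missing speed as 0
    threshold=${2:-1000}     # assumed default, from the table text
    [ "$speed" -ge "$threshold" ]
}
```

On a running node, the output of ethtool eth0 piped through parse_speed would supply the first argument to speed_ok; the interface name eth0 is an assumption.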