HP XC System Software Administration Guide Version 3.2

21 Troubleshooting
This chapter provides information to help you troubleshoot problems with HP XC systems. It
addresses the following topics:
“General Troubleshooting” (page 245)
“Nagios Troubleshooting” (page 247)
“Messages Reported by Nagios” (page 249)
“System Interconnect Troubleshooting” (page 252)
“Improved Availability Issues” (page 260)
“SLURM Troubleshooting” (page 261)
“LSF-HPC Troubleshooting” (page 263)
See also Chapter 20 (page 231) for information on available diagnostic tools that you can use to
locate the source of the failure.
21.1 General Troubleshooting
This section contains general troubleshooting information for HP XC systems.
21.1.1 Cannot Connect to Database During Configuration
At times, especially during the initial configuration or reconfiguration of the system, you might
see the following message:
Can't connect to local MySQL server through socket '/var/lib/mysql/mysql.sock'
If you see that message, perform the following steps to restart the database and resolve the
problem:
1. As root on the head node, restart the database:
# service mysqld restart
This command might report that it failed to stop or to restart the MySQL processes. If so,
continue with the remainder of this procedure.
2. Enter the following command to find MySQL processes:
# ps -eaf | grep mysql
Three processes should be listed: grep, mysqld_safe, and mysqld. If you do not see
mysqld_safe and mysqld, proceed to step 4.
3. Use the process ID (PID) of /usr/libexec/mysqld (the number just after the process
owner name) to kill mysqld manually. If the mysqld process is not listed, but there is a
mysqld_safe process, use that PID instead.
# kill mysqld_PID
This command should stop both mysqld and mysqld_safe.
4. Restart the mysqld service:
# service mysqld restart
The command you were trying to invoke should now be able to connect to the database.
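The procedure above can be sketched as a small shell script. This is only an illustrative sketch and is not part of the HP XC software. The function names pick_mysql_pid and restart_mysqld are hypothetical. The field positions assume the usual ps -eaf column layout (UID, PID, PPID, C, STIME, TTY, TIME, CMD), and the 5-second pause after kill is an assumed grace period, not a documented value.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -eaf` output on stdin and print the PID to kill.
# It prefers the mysqld daemon and falls back to mysqld_safe if mysqld is not listed.
pick_mysql_pid() {
    awk '
        # Field 2 is the PID (the number just after the process owner name).
        # Field 8 is the command; mysqld_safe usually runs as "/bin/sh .../mysqld_safe",
        # so its name can appear in field 8 or field 9 (assumed layout).
        $8 ~ /(^|\/)mysqld$/ && daemon == "" { daemon = $2 }
        ($8 ~ /mysqld_safe$/ || $9 ~ /mysqld_safe$/) && safe == "" { safe = $2 }
        END {
            if (daemon != "") print daemon
            else if (safe != "") print safe
        }
    '
}

# Hypothetical wrapper for steps 1 through 4; run as root on the head node.
restart_mysqld() {
    # Step 1: this may report a failure; continue regardless.
    service mysqld restart

    # Steps 2 and 3: find a leftover MySQL process and kill it manually.
    pid=$(ps -eaf | pick_mysql_pid)
    if [ -n "$pid" ]; then
        kill "$pid"
        sleep 5   # assumed grace period for mysqld and mysqld_safe to exit
    fi

    # Step 4: restart the mysqld service.
    service mysqld restart
}
```

Defining the functions does not touch the database. To carry out the procedure, source the file as root on the head node and call restart_mysqld. The PID selection is kept in a separate function so that it can be checked against saved ps output before anything is killed.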