User Service Guide, Fifth Edition - HP Integrity rx8640 SEU
Table 4-11 Cooling Troubleshooting
Symptom:
Common cooling failure indications:
- System will not power up
- Green fan LED may not be lighted
- Error log indicates a failed fan
Action:
When troubleshooting cooling, LED indications, the MP "ps"
command, and the Error Log are your most useful tools.
Cooling problems are most often isolated to a failed fan or a high
ambient room temperature. Visual inspections for LED indications
and the use of the "ps" command should pinpoint the failing FRU.
Also inspect the server for blocked input air accesses.
Backplane (fabric) Failures
Backplane problems are among the most difficult to troubleshoot. The crossbar
complexity makes it difficult to narrow the scope of a failure to a specific FRU. Failure possibilities
may include a cell, a connector, a failing XBC, or a failing I/O response. Only familiarity with
the backplane's XBC port (logical-physical) crossbar technology and the ability to decode fabric
errors will allow you to troubleshoot backplane (fabric) errors successfully. If you are unfamiliar
with the crossbar technology, it is strongly suggested that you contact WTEC for assistance. Reading
the Error Log and using the MCA Analyzer are your most valuable tools.
The new sx2000 backplane supports up to four Cells, interconnected via the crossbar links. A sustained
total bandwidth of 25.5 GBytes/s is provided to each Cell. Each Cell connects to XBC ASICs that
enable it to communicate with other Cells in the server.
The only fatal errors on XBCs are link errors detected in the sx2000 Link Block (ALB). When the
ALB cannot recover from a link error, it logs the error, completes the remainder of the packet
with poisoned micropackets, and brings the link down. If the error occurs in the header, ALB
will try to correct 1-bit errors. Double-bit errors in the header are uncorrectable and hence fatal.
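The correct-one-bit, detect-two-bit behavior described above can be illustrated with a generic SECDED code. The following is a minimal sketch only: this guide does not state which code the ALB actually uses, so the extended Hamming (8,4) code and the function names `encode` and `check` are assumptions chosen for illustration.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit extended Hamming codeword (list of bits).

    Index 0 holds the overall parity bit; indexes 1..7 are Hamming(7,4) positions.
    Illustrative only; not the ALB's actual header code.
    """
    d = [(nibble >> i) & 1 for i in range(4)]
    code = [0] * 8
    code[3], code[5], code[6], code[7] = d
    code[1] = code[3] ^ code[5] ^ code[7]   # parity over positions 1,3,5,7
    code[2] = code[3] ^ code[6] ^ code[7]   # parity over positions 2,3,6,7
    code[4] = code[5] ^ code[6] ^ code[7]   # parity over positions 4,5,6,7
    code[0] = sum(code[1:]) % 2             # overall parity over positions 1..7
    return code


def check(code):
    """Classify a received codeword as 'ok', 'corrected', or 'fatal'."""
    syndrome = 0
    for pos in range(1, 8):
        if code[pos]:
            syndrome ^= pos                 # XOR of the positions of all set bits
    overall = sum(code) % 2                 # 0 when the overall parity is consistent
    if syndrome == 0 and overall == 0:
        return "ok", list(code)
    if overall == 1:
        # Odd number of flipped bits: treat as a single-bit error and repair it.
        fixed = list(code)
        fixed[syndrome] ^= 1                # syndrome 0 means the parity bit at index 0
        return "corrected", fixed
    # Non-zero syndrome with consistent overall parity: double-bit error.
    return "fatal", list(code)
```

In this sketch a single flipped bit is repaired in place, while two flipped bits are reported as fatal, which mirrors the header handling described in the text.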
When a link is powered down, the XBC invalidates the route table and empties all inport
FIFOs of pending transactions. Transactions destined for the powered-down link from other ports
are accepted and inter credits are returned; these transactions are marked invalid.
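The power-down sequence above can be sketched as a small software model. This is a toy illustration under stated assumptions, not the XBC's implementation: the names `Transaction`, `Port`, `ToyCrossbar`, `power_down`, and `route` are hypothetical, the route table is assumed to be per port, and credit handling is reduced to a simple counter.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Transaction:
    dest_port: int
    payload: str
    valid: bool = True


@dataclass
class Port:
    link_up: bool = True
    route_table: dict = field(default_factory=dict)   # assumed per-port table
    inport_fifo: deque = field(default_factory=deque)


class ToyCrossbar:
    """Hypothetical model of the link power-down behavior described in the text."""

    def __init__(self, num_ports):
        self.ports = [Port() for _ in range(num_ports)]
        self.credits_returned = 0

    def power_down(self, port_id):
        port = self.ports[port_id]
        port.link_up = False
        port.route_table.clear()    # invalidate the route table
        port.inport_fifo.clear()    # drop pending inbound transactions

    def route(self, txn):
        # A transaction aimed at a powered-down link is still accepted and its
        # credit returned to the sender, but it is marked invalid.
        if not self.ports[txn.dest_port].link_up:
            self.credits_returned += 1
            txn.valid = False
        return txn                  # forwarding on healthy links is not modeled
```

The point of the model is that a sender never stalls waiting for credits from a dead link; the transaction is consumed and flagged rather than left pending.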
Table 4-12 Backplane Troubleshooting
Symptom:
Backplane failure indications:
- BPB green LED may not be lighted.
- Error log indicates an error; read the log with the MP.
Action:
Search the error logs for fatal (link) errors.
Try visual inspections for LED (power)/(attention) indications.
Use the "ps" command to verify cell status.
Apparent backplane problems may actually be a failing Cell, XBC,
REO cables or IO FRU. Use logs and/or diagnostics to better pinpoint
the failing FRU; avoid changing several FRUs.
Utility Subsystem Failures
The Utility subsystem is primarily composed of the Maintenance Processor (MP). Your best
troubleshooting tools are visual interpretation of LED indications and the use of the Error Log,
if it is accessible. If the MP fails, the following functions are lost:
- The ability to process and store log entries.
- Console functions to every partition.
- OL* functions.
- Virtual front panel and system alert notification.
- The ability to connect to the MP for maintenance.
- The ability to run diagnostics (ODE and scan).