HP-UX HB v13.00 Ch-08 - Crash Dumps

HP-UX Handbook Rev 13.00 Page 7 (of 38)
Chapter 08 Crash Dumps
October 29, 2013
executing at the time of the HPMC/TOC event. Once the state has been saved, the operating system
continues to dump physical memory to the dump device.
Software crash events
A software crash event occurs when the panic() routine is called, through either a direct or an
indirect panic. For a software crash event, the PDC and PIM are not involved at all. As such, the
first thing that the panic() routine does is save the processor state into the RPB structure. The
panicking processor also initiates a TOC to the other processors, causing them to stop what they
are doing as close as possible to the point where the problem was detected. This is important to
allow the cause of the panic to be identified.
panic() actually calls a leaf routine, panic_save_register_state(), to save the processor register
state. So the return pointer (rp) in the RPB structure actually points to the panic() routine. The
instruction address (pcoq) is zeroed out in the RPB to prevent unwinding beyond panic(), since this
is the point of interest. Since panic_save_register_state() is a leaf routine, the stack pointer (sp)
in the RPB will be the same as that of panic().
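The three RPB fields described above can be pictured with a small model. This is only an illustrative sketch: the RPB class and save_state_from_panic() are hypothetical names, the real RPB holds far more than three fields, and the addresses in the usage note are made up.

```python
from dataclasses import dataclass


@dataclass
class RPB:
    """Toy stand-in for the three RPB fields named in the text (not the real layout)."""
    rp: int    # return pointer
    pcoq: int  # instruction address
    sp: int    # stack pointer


def save_state_from_panic(return_addr_in_panic: int, panic_sp: int) -> RPB:
    """Model what the text says the RPB holds after a software panic."""
    return RPB(
        rp=return_addr_in_panic,  # the leaf routine returns into panic()
        pcoq=0,                   # zeroed so unwinding stops at panic()
        sp=panic_sp,              # per the text, same sp as panic()
    )
```

For example, save_state_from_panic(0x1000, 0x7000) yields rp == 0x1000, pcoq == 0 and sp == 0x7000, which mirrors the statement that unwinding starts in panic() itself.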
For a direct panic, the RPB contains the processor register state of the routine that called
panic(). In other words, the RPB contains the information closest to the point of failure, in the
same context in which panic() was called. Thus dump analysis begins with the RPB for direct
panics.
For an indirect panic, the RPB contains the context of a trap handler and does not reflect the
values of the registers at the time of the fault. Please see the following diagram. An indirect panic
is usually the result of a trap condition that cannot be resolved by the operating system. The
trap handler needs to save the processor state information before bringing down the system
gracefully with a panic call. The trap handler stores this register state in a save_state
structure. So for an indirect panic, the save_state structure contains the information closest to the
point of failure that triggered the trap condition. Thus dump analysis begins with the
save_state for indirect panics.
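The rule for where to begin analysis can be summarized as a simple lookup. The sketch below is illustrative only: PanicKind and analysis_start() are hypothetical names, not part of HP-UX or of any dump analysis tool, and it assumes the analyst has already determined whether the panic was direct or indirect.

```python
from enum import Enum


class PanicKind(Enum):
    DIRECT = "direct"      # panic() called directly by the failing routine
    INDIRECT = "indirect"  # panic() called by a trap handler


# Structure holding the register state closest to the point of failure.
_START_STRUCTURE = {
    PanicKind.DIRECT: "RPB",
    PanicKind.INDIRECT: "save_state",
}


def analysis_start(kind: PanicKind) -> str:
    """Return the structure where dump analysis should begin."""
    return _START_STRUCTURE[kind]
```

Here analysis_start(PanicKind.INDIRECT) returns "save_state", because in that case the RPB only reflects the trap handler's context.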
After panic() has saved the state, it proceeds to dump physical memory to the dump device.
PIM Tombstone
The Processor Internal Memory, or PIM, is a storage area in a processor that is set at the time of an
HPMC, LPMC, Soft Boot, or TOC, and is composed of the architected state save error
parameters, and HVERSION-dependent (i.e., processor-dependent) regions. The internal structure
of PIM is processor dependent. The PDC_PIM procedure is used to access PIM data.
Different systems have different methods of accessing PIM information. On some systems, there
is a pdcinfo program that allows online retrieval of this PIM data. This can be helpful for retrieving
HPMC tombstone data for analysis. The script in /sbin/init.d/pdcinfo automatically runs the pdcinfo
command when HP-UX is booted and saves any tombstones in a file in the directory
/var/tombstones. Up to 100 files can be saved. The file "ts99" is the most current, "ts98" is the
next most current, and so on; "ts0" would be the oldest.
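Given that naming scheme, tombstone files can be listed from newest to oldest by sorting on the numeric suffix. The following is a minimal sketch: tombstones_newest_first() is a hypothetical helper, not part of pdcinfo, and it assumes file names have exactly the form "ts" followed by digits, as described above.

```python
import re

# Names of the form ts0 .. ts99, as described in the text.
_TS_NAME = re.compile(r"ts(\d+)")


def tombstones_newest_first(names):
    """Order tombstone file names from most current (highest number) to oldest."""
    numbered = []
    for name in names:
        match = _TS_NAME.fullmatch(name)
        if match:  # ignore anything that is not a tsNN file
            numbered.append((int(match.group(1)), name))
    # Sort numerically, not lexically, so that ts10 ranks above ts9.
    return [name for _, name in sorted(numbered, reverse=True)]
```

On a live system the input could come from os.listdir("/var/tombstones"); the first element of the result is then the tombstone to examine first.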
From a dump analysis point of view (especially HPMC/TOC), the RPB structure should be a