HP-UX HB v13.00 Ch-11 - Software Development

HP-UX Handbook Rev 13.00 Page 76 (of 101)
Chapter 11 Software Development
October 29, 2013
Program Aborts
A program can abort abnormally for any number of reasons. We can
distinguish between two types of aborts:
- The program encounters an error situation and stops by its own means. In this case it
  should print a (hopefully self-explanatory) error message.
- The system (the kernel) encounters an error while executing the program and aborts it by
  sending it a signal. Depending on the signal (see signal(5)), the system might write a
  core file, which is an image of the private memory of the process. The core file can be
  analyzed with a debugger.
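From the calling shell, the two cases can usually be told apart by the exit status. The
following is a minimal sketch, assuming a shell that reports death by signal N as exit
status 128 + N (the common convention in Bourne-style shells). classify_status is a
hypothetical helper name, not a system command. A program may itself call exit with a
value above 128, so the result is only a hint.

```shell
# classify_status: hypothetical helper, not part of HP-UX.
# Usage: classify_status <exit status as reported in $?>
classify_status() {
    st=$1
    if [ "$st" -gt 128 ]; then
        # Assumption: the shell encodes "killed by signal N" as 128 + N.
        signo=$((st - 128))
        # kill -l translates the signal number to a name where supported.
        echo "terminated by signal $signo ($(kill -l "$signo" 2>/dev/null))"
    elif [ "$st" -ne 0 ]; then
        echo "exited by its own means with error status $st"
    else
        echo "exited normally"
    fi
}
```

For example, run the program and pass its status to the helper: ./myprog; classify_status $?
(myprog is a placeholder name).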
Core file analysis can be very complex and should normally be done by the program
developers, as they know their programs best. It is not explained in depth here;
only a basic overview is given, together with a few common problems and hints on how to
recognize and solve them.
If the problem is reproducible, it is often useful to see what happens before the abort. In that
case, tracing the program’s system calls with tusc might be helpful.
If there is no core file, there is still a chance to debug the problem, provided it is reproducible.
Run the program under the debugger. You must somehow manage to stop process execution
immediately before it aborts. If you do not stop it, the program terminates and leaves nothing to
debug. If it receives a signal, the debugger stops its execution automatically.
There are not many ways for a program to terminate. If it does not abort with a signal, it most
probably calls exit(2). Setting a breakpoint at exit(2) therefore has a good chance of catching
the abort. At this point you can do the same analysis as with a program that dumped a core file.
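The breakpoint approach can be sketched as a small gdb command file. This is a hedged
example, assuming a gdb-compatible debugger that accepts the -x option to read commands
from a file. write_exit_bp_cmds, the file name, and myprog are hypothetical names used
only for illustration.

```shell
# write_exit_bp_cmds: hypothetical helper that writes a gdb command file.
# Usage: write_exit_bp_cmds <command file>
write_exit_bp_cmds() {
    # break exit : stop when the program calls exit
    # run        : start the program
    # backtrace  : show the call stack once the breakpoint is hit
    cat > "$1" <<'EOF'
break exit
run
backtrace
EOF
}

# Assumed invocation (paths are placeholders):
#   write_exit_bp_cmds /tmp/exit_bp.gdb
#   gdb -x /tmp/exit_bp.gdb ./myprog
```

Without a batch option the debugger stays interactive after the backtrace, so the
analysis can continue from the stopped process.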
Getting Usable Core Files
There are a number of things that can interfere with the writing of a core file. The result can
either be a truncated or corrupted core file, or no core file at all. Truncated core files can be
caused by the following:
- The limit for the maximum size of core files, ulimit -c (see sh-posix(1)), is too
  small.
- There is not enough disk space available to write the core file.
- The file system’s largefiles option is not enabled, so core files > 2 GB cannot be written.
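The first two causes can be checked from the shell before reproducing the abort. The
following is a minimal sketch. core_limit_report and check_core_prereqs are hypothetical
helper names, and the sketch assumes the core file is written to the directory passed as
the argument (typically the working directory of the process). The unit of the reported
limit depends on the shell; see sh-posix(1).

```shell
# core_limit_report: hypothetical helper, prints the current core size limit.
core_limit_report() {
    limit=$(ulimit -c)
    case "$limit" in
        unlimited) echo "core size limit: unlimited" ;;
        0)         echo "core size limit: 0 (core dumps effectively disabled)" ;;
        *)         echo "core size limit: $limit blocks (larger core files may be truncated)" ;;
    esac
}

# check_core_prereqs: hypothetical helper, shows the limit and the free space.
# Usage: check_core_prereqs <directory where the core file is expected>
check_core_prereqs() {
    core_limit_report
    # Raw df output in kilobytes for the file system holding the directory.
    df -k "$1"
}
```

For example: check_core_prereqs . reports on the current directory. If the limit is too
small, ulimit -c unlimited raises it for the current shell and its children, as far as
the hard limit allows.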
To check if a core file is complete, use what(1):
$ what core
core: