HP Caliper User Guide Release 5.5 (5900-2351, August 2012)

Table 26 Information in fprof Measurement Reports
Column                    Description
% Total IP Samples        Percent of the total IP samples attributable to a given program object.
Cumulat % of Total        Running sum of the percent of total IP samples accounted for by the given
                          program object and those listed above it.
IP Samples                Total number of IP samples attributed to the given program object.
Kernel Thread             Kernel Thread ID suffixed with the name of the routine that the thread
Identification Number     will execute once it is created.
Load Module               Shared library or the main executable.
Function                  Routine from your application.
File                      Source file associated with a function.
Line | Slot | Col,Offset  The column contains one of these:
                          • A source-code line number for rows showing statements
                          • An instruction slot number for rows showing instructions not on a
                            bundle boundary
                          • A source-code column number followed by an offset from the beginning
                            address of a function for rows showing instructions on a bundle
                            boundary
                          Column and line numbers are preceded by “~” when they are approximate
                          due to optimization.
>Statement | Instruction  The column contains either a source statement preceded by “>” or a
                          disassembled instruction. Statements that are out of order due to
                          optimization are preceded by “*>”.
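The relationship between the IP Samples, % Total IP Samples, and Cumulat % of Total columns can be illustrated with a short Python sketch. This is not HP Caliper code. The function name, the sample counts, and the descending sort order are assumptions made for illustration only.

```python
def fprof_percent_columns(ip_samples):
    """Return rows of (name, samples, percent_of_total, cumulative_percent).

    ip_samples maps a program object name to its IP sample count.
    Rows are sorted by descending sample count (an assumption), so the
    cumulative value is a running sum over the given object and those
    listed above it.
    """
    total = sum(ip_samples.values())
    rows = []
    running = 0.0
    ordered = sorted(ip_samples.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ordered:
        percent = 100.0 * count / total if total else 0.0
        running += percent
        rows.append((name, count, percent, running))
    return rows


# Hypothetical sample counts, not taken from a real HP Caliper report.
rows = fprof_percent_columns({"compute": 600, "parse": 300, "main": 100})
for name, count, percent, cumulative in rows:
    print(f"{percent:6.2f} {cumulative:6.2f} {count:8d}  {name}")
```

The last row's cumulative value reaches 100 percent once every program object with samples has been listed.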
How fprof Metrics Are Obtained
HP Caliper obtains fprof metrics using the performance monitoring unit (PMU).
Exact counts are obtained from the PMU's performance monitor configuration (PMC)/performance
monitor data (PMD) register pairs. Sampled IPs are obtained from the operating system.
HP Caliper takes samples by using the overflow of one of the PMU's event counters as a sampling
trigger. Samples are taken every Nth PMU event, where both N and the sampling event are defined
in the fprof measurement configuration file, located in the config subdirectory of the HP Caliper
home directory. You can override the value in the measurement configuration file by using the -s
option.
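The idea of sampling on every Nth event can be modeled with a simple counter. The following Python sketch is a simplified model, not the PMU's actual register behavior or HP Caliper's implementation. The function name and the input data are hypothetical, and the model ignores the latency between the event and the recorded IP.

```python
def sample_every_nth(event_ips, n):
    """Model an event counter that triggers a sample after every n events.

    event_ips is the sequence of IPs at which the sampling event occurred
    (hypothetical data). Each time the counter reaches n, a sample is
    recorded and the counter is re-armed.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    samples = []
    counter = 0
    for ip in event_ips:
        counter += 1
        if counter == n:  # models the counter overflow (sampling trigger)
            samples.append(ip)
            counter = 0
    return samples


# With 1000 events and N = 100, ten samples are taken.
samples = sample_every_nth(range(1000), 100)
print(len(samples))
```

A larger N yields fewer samples and less measurement overhead. A smaller N yields more samples at higher overhead.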
The list of processor metrics you can use for the sampling event is available in the file
itanium2_cpu_counters.txt, located in the doc/text subdirectory of the HP Caliper home
directory.
The IP collected at each sampling point is the IP recorded by the kernel (in the process's save state)
when the PMU overflow trap is taken. The kernel does not record an instruction slot number. Thus,
the finest granularity HP Caliper reports is the instruction bundle.
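As background, an Itanium instruction bundle occupies 16 bytes and holds three instruction slots, so samples that lack a slot number can only be grouped by bundle address. The following Python sketch shows that grouping. The function names and the addresses are hypothetical, and the code is not part of HP Caliper.

```python
from collections import Counter

BUNDLE_SIZE = 16  # bytes per Itanium instruction bundle (three slots)


def bundle_address(ip):
    """Round an address down to the start of its 16-byte bundle."""
    return ip & ~(BUNDLE_SIZE - 1)


def samples_per_bundle(sampled_ips):
    """Count samples per bundle; slot-level detail is not available."""
    return Counter(bundle_address(ip) for ip in sampled_ips)


# Hypothetical sampled addresses.
counts = samples_per_bundle([0x4000_1230, 0x4000_1230, 0x4000_1240])
for address, count in sorted(counts.items()):
    print(f"{address:#x} {count}")
```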
The IP that HP Caliper records is the address of the next instruction that will execute when the kernel
resumes execution of your application. It is not the address of the instruction that caused the event
that resulted in the PMU overflow trap, because of the delays associated with incrementing
the PMU counter, detecting the overflow, and triggering the trap. The instruction
that caused the PMU overflow therefore executed some number of cycles, typically in the low tens,
before the sample was taken. The recorded address might or might not point to the
instruction causing the event, depending on pipeline stalls.
The latency between the event triggering the sample and the actual sample is not a problem if you
are using fprof to find hot spots in your application. It is only an issue if you try to use fprof
to find particular instructions that cause the events recorded by the PMU, in which case you must
take the latency into account.
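The following Python sketch is a toy model of why this latency matters little for hot-spot analysis but does matter for instruction-level attribution. It assumes straight-line execution and a fixed skid of three instructions. Both are simplifications chosen for illustration, not measured HP Caliper or PMU behavior, and all names and values are hypothetical.

```python
from collections import Counter


def attribute(event_indices, layout, skid):
    """Attribute each event to the function owning the sampled instruction.

    layout[i] names the (hypothetical) function that owns instruction i.
    skid is the assumed number of instructions between the instruction
    that caused the event and the instruction whose address is sampled.
    """
    last = len(layout) - 1
    return Counter(layout[min(i + skid, last)] for i in event_indices)


# Two hypothetical functions laid out back to back, 40 instructions each.
layout = ["hot"] * 40 + ["cold"] * 40
events = list(range(40)) * 5  # every event is caused inside "hot"

exact = attribute(events, layout, skid=0)
skewed = attribute(events, layout, skid=3)
print(exact)
print(skewed)
```

In this model, only events near the end of the hot function are charged to its neighbor, so the function still dominates the profile. Every individual sample, however, points a few instructions past the one that caused the event.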