HP Caliper 5.3 User Guide (5900-1558, February 2011)

This prevents the compiler from reordering statements while optimizing code, so the measured
program's performance may be worse than it would be otherwise. For example, with sample points
inside a loop, this could mean that loop invariant promotion or other loop transformations
become illegal or less effective. For sample points placed at the entrance and exit of functions,
this could affect performance if the function is inlined.
Unfortunately, the only way to check for such issues is to compare the code generated by the
compiler with and without those macros, and estimate whether the program measurements
are significantly affected.
Restricting PMU Measurements to Specific Code Regions
By default, HP Caliper measures PMU events for your entire program. You can, however, restrict
measurements to performance-sensitive regions of code. This feature is enabled with the
CALIPER_PMU_ENABLE and CALIPER_PMU_DISABLE macros and the --user-regions
option.
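The following C fragment is a minimal sketch of how a loop might be bracketed with these
macros. It is an illustration under stated assumptions, not code from the HP Caliper
distribution: the function dot_product is hypothetical, the macros are assumed to be invoked
like function calls, and the fallback definitions exist only so that the fragment compiles
without the HP Caliper header, which this section does not name.

```c
#include <stddef.h>

/* Fallback stubs so this sketch compiles without HP Caliper installed.
 * Assumption: in a real build the macros come from the HP Caliper header
 * and these no-op definitions are never used. */
#ifndef CALIPER_PMU_ENABLE
#define CALIPER_PMU_ENABLE()  ((void)0)
#endif
#ifndef CALIPER_PMU_DISABLE
#define CALIPER_PMU_DISABLE() ((void)0)
#endif

/* Hypothetical kernel: dot product of two arrays of length n. */
double dot_product(const double *a, const double *b, size_t n)
{
    double sum = 0.0;
    size_t i;

    CALIPER_PMU_ENABLE();             /* region of interest starts here */
    for (i = 0; i < n; i++) {
        sum += a[i] * b[i];
    }
    CALIPER_PMU_DISABLE();            /* region of interest ends here */

    return sum;
}
```

When a program built this way is measured with the --user-regions option, the intent is that
only the code between the two macros contributes to the PMU measurement. Placing the macros
outside the loop body, rather than inside it, keeps them from interfering with optimization
of the loop itself.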
You can use this feature with these measurements:
alat
branch
dcache
dtlb
ecount
fprof
icache
itlb
pmu_trace
scgprof
While you can also use this feature with the cgprof measurement, it might lead to inconsistent
results. This is because the time statistics are collected using the PMU, while the call graph and
function counts are collected using dynamic instrumentation.
Reasons to use this feature include:
Analyzing a particular loop or function. You can restrict measurements to a particular loop
to get information such as:
    ecount    Number of events occurring in the loop
    fprof     Hot spots in the loop
    branch    Analysis of the loop branches
    dcache    Data cache misses in the loop
Analyzing a particular phase in an application.
For applications with important startup or shutdown phases, it is sometimes beneficial to limit
measurements to the “in-between” phase. This technique allows you to use test cases that run
for a shorter time, without having to worry about the effects caused by the startup and shutdown
code.
Similarly, the data collection can be restricted to the startup or shutdown phases to target
those for performance improvements.
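Phase-based restriction can be sketched in C as follows. This is an illustration only: the
phase functions startup, steady_state, and shutdown_phase and the driver run_application are
hypothetical names, the macros are assumed to be invoked like function calls, and the
fallback definitions are placeholders so that the fragment compiles without the HP Caliper
header.

```c
#include <stddef.h>

/* Fallback stubs (assumption): in a real build the macros come from
 * the HP Caliper header and these no-op definitions are never used. */
#ifndef CALIPER_PMU_ENABLE
#define CALIPER_PMU_ENABLE()  ((void)0)
#endif
#ifndef CALIPER_PMU_DISABLE
#define CALIPER_PMU_DISABLE() ((void)0)
#endif

/* Hypothetical startup phase: initialize the working buffer. */
static void startup(double *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        buf[i] = (double)i;
    }
}

/* Hypothetical "in-between" phase: the work to be measured. */
static double steady_state(const double *buf, size_t n)
{
    double total = 0.0;
    size_t i;
    for (i = 0; i < n; i++) {
        total += buf[i];
    }
    return total;
}

/* Hypothetical shutdown phase: clear the working buffer. */
static void shutdown_phase(double *buf, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++) {
        buf[i] = 0.0;
    }
}

double run_application(double *buf, size_t n)
{
    double result;

    startup(buf, n);                  /* outside the measured region */

    CALIPER_PMU_ENABLE();             /* measure only the middle phase */
    result = steady_state(buf, n);
    CALIPER_PMU_DISABLE();

    shutdown_phase(buf, n);           /* outside the measured region */
    return result;
}
```

To target the startup or shutdown phase instead, the same pair of macros would be moved to
surround the call to startup or shutdown_phase.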