HP Caliper 5.3 User Guide (5900-1558, February 2011)

• Exclude the interruption state: --measure-on-interrupts off

• Only measure the interruption state: --measure-on-interrupts only

Metrics Available from this Measurement

The following metrics are available from this event set. These descriptions do not take into account

any command-line options you might use.

The metrics are:

• Raw CPI

The raw CPI is computed using all instructions retired. This includes nops and predicated-off

instructions. The relationship between effective and raw CPI values can be obtained from the

cpi measurement.
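
As a rough illustration of the raw CPI definition above, the following Python sketch divides a cycle count by two different instruction counts. All numbers are hypothetical and do not come from any Caliper run. The "effective" figure assumes that effective CPI excludes nops and predicated-off instructions; consult the cpi measurement description for the exact definition.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Return cycles per instruction for the given counts."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


# Hypothetical counts, for illustration only.
cycles = 1_200_000
retired_all = 1_000_000   # all retired instructions, nops and predicated-off included
nops = 150_000
predicated_off = 50_000

# Raw CPI uses every retired instruction.
raw_cpi = cpi(cycles, retired_all)

# Assumption: effective CPI counts only the remaining "useful" instructions.
effective_cpi = cpi(cycles, retired_all - nops - predicated_off)

print(f"raw CPI = {raw_cpi:.2f}, effective CPI = {effective_cpi:.2f}")
```

Because the raw figure divides by a larger instruction count, it is never higher than the effective figure for the same run.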

• Itlb

This counts the number of cycles where there are no back-end stalls or flushes, the decoupling

buffer is empty, and the front end is stalled due to an L1 TLB miss that is serviced either by the

L2 TLB or, if the L2 TLB also misses, by the hardware page walker (HPW) when the TLB entry is found somewhere in the cache hierarchy.

This does not count cycles attributable to software TLB miss handling when the HPW fails to

find the requisite translation.

• Icache

This counts the number of cycles where there are no back-end stalls or flushes, the decoupling

buffer is empty, and the front end is stalled due to an instruction cache miss at any level of

the cache hierarchy (L1, L2, L3).

• Branch

This counts the number of stall cycles associated with branch execution. There are two

components to this category. The first is stalls due to execution bubbles caused by a front-end

resteer, that is, a taken branch. The second component is stalls due to the recirculation of

branches while they are waiting for branch history information used in predicting branch

direction.

• Unstall Execute

This is the percentage of cycles when the back end is executing instructions without stalling.

Depending on code characteristics and resource limitations, the number of instructions executing

varies from 1 to 6, which is the maximum dispatch for the Itanium 2 processor. Taken branches,

non-double-bundle aligned branch targets, and explicit stop bits are the primary determinants

of code-based execution limitations. You can obtain some idea of this from the dispersal

event set.

• BE Flush

This counts the number of stall cycles resulting from a pipeline flush caused by a branch

misprediction, an exception, an ALAT flush, or a serialization flush.

• Scoreboard

This counts stall cycles due to dependencies on integer or floating-point operations, floating-point

flushes, and control or application register reads or writes.

• L1Dtlb

This counts the number of cycles stalled due to a level 1 data TLB miss that hits in the level 2

data TLB. This is sometimes called an L1DTLB transfer stall. If the level 2 TLB misses, the hardware

page walker (HPW) is invoked to insert the required page into the level 2 TLB, which is then

forwarded to the level 1 data TLB.
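
The lookup path described for L1Dtlb can be sketched as a toy model. This is only an illustration of the order of events, not a description of the hardware: the dictionaries, the `translate` function, and the path labels are hypothetical names, and software TLB miss handling is deliberately left out.

```python
def translate(vpn, l1_dtlb, l2_dtlb, page_table):
    """Toy model: return (frame, path) for a virtual page number."""
    if vpn in l1_dtlb:
        return l1_dtlb[vpn], "L1 hit"
    if vpn in l2_dtlb:
        # L1 miss that hits in L2: the entry is transferred to L1.
        l1_dtlb[vpn] = l2_dtlb[vpn]
        return l1_dtlb[vpn], "L1 miss, L2 hit"
    if vpn not in page_table:
        # The page walker found nothing; software handling is out of scope here.
        raise KeyError(f"no translation for page {vpn}")
    # L2 miss: the page walker inserts into L2, which then forwards to L1.
    l2_dtlb[vpn] = page_table[vpn]
    l1_dtlb[vpn] = l2_dtlb[vpn]
    return l1_dtlb[vpn], "L2 miss, page walker fill"


# Hypothetical state: both TLBs start empty, one mapping in the page table.
l1, l2 = {}, {}
table = {7: 42}

first = translate(7, l1, l2, table)    # misses both levels
second = translate(7, l1, l2, table)   # now present in L1
del l1[7]                              # simulate an L1 eviction
third = translate(7, l1, l2, table)    # served from L2
print(first, second, third)
```

In this model, the third lookup corresponds to the case that L1Dtlb counts, and the first corresponds to a level 2 miss resolved by the page walker.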

• L2Dtlb

This counts the number of cycles stalled due to a level 2 data TLB miss during the time the

HPW is actively attempting to resolve the requested TLB entry. If the entry is not in the cache,
