HP Caliper 5.3 User Guide (5900-1558, February 2011)

(McKinley, Madison, and Deerfield), this should be approximately equal to the L1I cache fill
rate.
%ISB Line Usage
This is the percentage of ISB lines that are actually delivered to the L1I cache. For the Itanium
2 family of processors (McKinley, Madison, and Deerfield), this fraction will be at or slightly
less than 100%.
%Miss - All
This is the percentage of the total misses (instruction demand fetch misses and instruction
prefetch misses) out of the total number of L1 instruction accesses (instruction demand fetch
and instruction prefetch). The prefetches include both streaming and non-streaming prefetches.
%Miss - Dfetch
This is the percentage of the number of demand instruction fetch misses out of the total instruction
demand fetch accesses.
%Miss - Pfetch
This is the percentage of the number of instruction prefetch misses out of the total number of
instruction prefetch requests. The instruction prefetch count includes streaming and non-streaming
prefetches.
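The three miss percentages above are simple ratios, and the following minimal Python sketch shows how they relate to one another. It is an illustration only, not Caliper's implementation: the function name, the parameter names, and the sample counts are hypothetical, and the sketch assumes that the four raw counts (demand fetch accesses and misses, prefetch accesses and misses) are already available.

```python
def miss_percentages(dfetch_accesses, dfetch_misses,
                     pfetch_accesses, pfetch_misses):
    """Return (%Miss - All, %Miss - Dfetch, %Miss - Pfetch).

    All four arguments are hypothetical raw counts; prefetch counts are
    assumed to cover both streaming and non-streaming prefetches.
    """
    def pct(part, whole):
        # Report 0.0 rather than dividing by zero when nothing was counted.
        return 100.0 * part / whole if whole else 0.0

    all_accesses = dfetch_accesses + pfetch_accesses
    all_misses = dfetch_misses + pfetch_misses
    return (pct(all_misses, all_accesses),
            pct(dfetch_misses, dfetch_accesses),
            pct(pfetch_misses, pfetch_accesses))


# Made-up counts: 8000 demand fetches (400 misses), 2000 prefetches (600 misses).
miss_all, miss_dfetch, miss_pfetch = miss_percentages(8000, 400, 2000, 600)
print(miss_all, miss_dfetch, miss_pfetch)
```

With these sample counts, %Miss - All falls between the demand fetch and prefetch miss percentages, because it is weighted by the number of accesses of each kind.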
l2cache Event Set
The l2cache event set provides miss rate information for the L2 unified cache on Itanium 2 systems.
This measurement is valid only on Itanium 2 systems. On dual-core Itanium 2 and Itanium 9300
quad-core processor systems, the event set name l2cache will produce the l2dcache and
l2icache metrics.
The L2 cache metrics include miss information for instruction prefetch requests, instruction demand
requests, integer loads that miss the L1 cache, memory operations not handled by the L1 cache
(that is, integer stores), lfetch instructions, and floating-point load/store operations.
Several characteristics of L2 cache access need to be considered when interpreting L2 cache
measurement results. The L2 cache does not count a fetch to the second half of a line if the
fetch for the first half has already been counted. Secondary misses are counted as data
references, and semaphore operations are counted as a single atomic operation. Only requests
that have entered the OZ queue are counted. The following instructions are not counted: FROM_CCV,
SETF, PTC_G, FWB, MF, MFA, SYNCI, SYNCIA, PTCM, FC, and CC.
If you use this event set, the default is to make the measurements irrespective of CPU operating
state (that is, user, system, or interrupt states). By default, the idle state is not included in the
measurement. You can use command-line options to limit the scope of the measurement. Specifically,
you can:
Limit measurement to a specific privilege level: -m event_set[:all|user|kernel]
Include idle: --exclude-idle False
Exclude the interruption state: --measure-on-interrupts off
Only measure the interruption state: --measure-on-interrupts only
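As a rough illustration of how these scope options combine, the following Python sketch assembles them into an argument list. The helper name and its parameters are hypothetical. Only the option spellings come from the list above, and the sketch does not show where the options belong on a full caliper command line.

```python
def scope_options(event_set, level="all", include_idle=False, interrupts=None):
    """Build the measurement-scope options listed above as an argument list.

    level      -- "all", "user", or "kernel" (privilege level)
    interrupts -- None (default behavior), "off", or "only"
    This helper is an illustrative assumption, not part of HP Caliper.
    """
    if level not in ("all", "user", "kernel"):
        raise ValueError("level must be 'all', 'user', or 'kernel'")
    if interrupts not in (None, "off", "only"):
        raise ValueError("interrupts must be None, 'off', or 'only'")

    opts = ["-m", f"{event_set}:{level}"]
    if include_idle:
        # Idle is excluded by default, so including it needs an explicit option.
        opts += ["--exclude-idle", "False"]
    if interrupts is not None:
        opts += ["--measure-on-interrupts", interrupts]
    return opts


# User-level l2cache measurement with the interruption state excluded.
print(scope_options("l2cache", level="user", interrupts="off"))
```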
The event per kinst (event per 1000 instructions) metrics are computed using all instructions retired.
This includes nops, predicated-off instructions, failed speculation instructions and their associated
recovery code, as well as the architecturally visible instructions. You can eliminate the effects of
idle loops by using the command-line option --exclude-idle True (which is the default). The effects of
failed speculative operations and TLB misses cannot be directly eliminated, but you can get an
estimate of the impact of these events from the cspec, dspec, and tlb event sets. You can use the
cpi event set to obtain the fraction of all instructions retired that have an architecturally visible
result, except for predicated-off branches, which are counted as useful instructions (non-taken
branches) by the Itanium 2 PMU.
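The per-kinst normalization can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Caliper's implementation: the function names and the sample counts are hypothetical, and the rescaling by a useful-instruction fraction is just one possible way to apply a fraction such as the one obtained from the cpi event set.

```python
def events_per_kinst(event_count, instructions_retired):
    """Events per 1000 instructions retired (all retired instructions)."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return 1000.0 * event_count / instructions_retired


def events_per_useful_kinst(event_count, instructions_retired, useful_fraction):
    """Rescale the per-kinst rate to useful instructions only.

    useful_fraction is a hypothetical input: the fraction of retired
    instructions that have an architecturally visible result.
    """
    if not 0.0 < useful_fraction <= 1.0:
        raise ValueError("useful_fraction must be in (0, 1]")
    return events_per_kinst(event_count, instructions_retired) / useful_fraction


# Made-up counts: 2500 events over 10 million retired instructions.
rate = events_per_kinst(2500, 10_000_000)
useful_rate = events_per_useful_kinst(2500, 10_000_000, 0.5)
print(rate, useful_rate)
```

Because nops and predicated-off instructions inflate the denominator, the rate per useful instruction is never lower than the reported per-kinst rate.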