HP Caliper User Guide Release 5.5 (5900-2351, August 2012)

Source statement
Instruction
Table 21 Information in dcache Measurement Reports
Column: Description

% Total Dcache Latency Cycles: Total cache miss latency cycles, expressed as a percent of the total
cycles. For example, in Example 5, 84.29 percent of the total cycles were expended on data cache
misses.

Sampled Dcache Hits: Total number of sampled L1 (or FLD) data cache accesses attributed to the
given program object. Enabled only on Intel® Itanium® 9500 series processors.

Sampled Dcache Miss Rate: The percentage of cycles spent on data cache misses as compared to the
total dcache cycles (hits and misses combined).

Sampled Dcache Misses: Total number of sampled data cache misses attributed to the given program
object.

Dcache Latency Cycles: Number of cycles expended on data cache misses summed across samples for
the given program object.

Avg. Dcache Laten. Cycles: Average number of cycles expended on data cache misses across samples
for the given program object.

Latency Buckets as % Misses: The latency data is reported under eight different buckets: three for
cache information and five for memory information.
The top rows of the heading give the cache level names (such as L2 or L3) and the system memory
names. For example, in Example 5, cache levels L2 and L3 are shown and the system memory is shown
simply as Memory (spanning five buckets).
The system memory buckets vary depending on whether your system is a low-end server,
direct-connected cell system, or Superdome server. Possible memory bucket headings are:
Memory: System memory access
loc c2c: Local cache-to-cache (C2C) transactions between CPUs on the same front side bus (FSB)
    (Superdome Integrity server only)
loc memory: Cell local memory access (Superdome Integrity server only)
1 hop: Remote memory access that is one hop across the crossbar (Superdome Integrity server
    only)
2 hop: Remote memory access that is two hops across the crossbar (Superdome Integrity server
    only)
1&2 c2c: One- or two-hop cache-to-cache (C2C) remote memory access (Superdome Integrity
    server only)
The last row of the headings specifies the latency value in cycles. For example, in Example 5, the
L2 data cache has a latency of 7 cycles. All data cache misses with latency of less than or equal
to 7 cycles are grouped under the L2 bucket.
The buckets in Example 5 under L3 have latencies of 14 and 64 cycles. The bucket under 14
captures all latencies greater than 7 cycles and less than or equal to 14 cycles. The bucket under
64 captures latencies that are greater than 14 cycles and less than or equal to 64 cycles.
On an rx4640 Integrity server, as shown in Example 5, the last five buckets capture misses that
are serviced from system memory. The first bucket under the Memory heading captures latencies
that are greater than 64 cycles and less than or equal to 150 cycles. The last bucket captures all
latencies greater than 450 cycles.
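
The bucket rule described above (greater than the previous bound, less than or equal to this bound) can be sketched in a few lines of Python. This is only an illustration, not part of HP Caliper: the function name and labels are hypothetical, and only the cache bounds quoted from Example 5 (7, 14, and 64 cycles) are used. The five memory buckets are collapsed into one label because their intermediate bounds are not all stated here.

```python
from bisect import bisect_left

# Upper bounds, in cycles, of the three cache buckets quoted from Example 5.
CACHE_BOUNDS = [7, 14, 64]
# Hypothetical labels, one per bound above.
CACHE_LABELS = ["L2 (<= 7)", "L3 (<= 14)", "L3 (<= 64)"]


def bucket_for(latency_cycles):
    """Return the label of the bucket that a miss latency falls into."""
    # bisect_left returns the index of the first bound >= latency_cycles,
    # which is exactly the "previous bound < latency <= this bound" rule.
    index = bisect_left(CACHE_BOUNDS, latency_cycles)
    if index < len(CACHE_LABELS):
        return CACHE_LABELS[index]
    # Anything above the last cache bound is serviced from system memory.
    return "Memory (> 64)"


print(bucket_for(7))    # L2 (<= 7)
print(bucket_for(8))    # L3 (<= 14)
print(bucket_for(65))   # Memory (> 64)
```

A latency equal to a bound belongs to the bucket named by that bound, so a 7-cycle miss counts under L2 and an 8-cycle miss under the first L3 bucket.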
The reported values are the percentages of sampled dcache misses in the specified latency range.
For example, in Example 5, in the Function Totals column, the value of 92 under the L2 heading
means that 92 percent of all sampled data cache misses in the function goo were satisfied by the
L2 cache. Similarly, the value in the first bucket under the Memory heading means that 5 percent
of the misses had latencies greater than 64 cycles and less than or equal to 150 cycles.
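
To make the percentage calculation concrete, the following Python sketch turns a list of sampled miss latencies into per-bucket percentages. It is not how HP Caliper itself computes the report. The function name is hypothetical, the sample latencies are made up for illustration, and the bounds passed in are just the three cache bounds from Example 5, with everything above the last bound counted in one open-ended bucket.

```python
from bisect import bisect_left
from collections import Counter


def bucket_percentages(latencies, bounds):
    """Return the percent of sampled misses in each bucket.

    bounds holds the inclusive upper bound of each bucket in ascending
    order; one extra open-ended bucket catches latencies above the last.
    """
    total = len(latencies)
    if total == 0:
        return [0.0] * (len(bounds) + 1)
    # Count how many sampled latencies land in each bucket index.
    counts = Counter(bisect_left(bounds, lat) for lat in latencies)
    return [100.0 * counts[i] / total for i in range(len(bounds) + 1)]


# Made-up sample latencies in cycles, for illustration only.
samples = [5, 6, 7, 7, 12, 40, 90, 500]
print(bucket_percentages(samples, [7, 14, 64]))
# [50.0, 12.5, 12.5, 25.0]
```

The percentages across all buckets for one program object add up to 100, because every sampled miss falls into exactly one bucket.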
If you turn off the latency bucket information by using the --latency-buckets False option,
the information in the Latency Buckets as % Misses column is not displayed.
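
Two of the columns in Table 21 follow from the others by simple arithmetic. The Python sketch below shows that relationship under stated assumptions. The function names and input counts are hypothetical, and the choice of denominator for the percentage (whatever total the report uses as "total cycles") is left to the caller because this section does not define it precisely.

```python
def avg_dcache_latency(latency_cycles, sampled_misses):
    """Average cycles per sampled miss (Avg. Dcache Laten. Cycles)."""
    if sampled_misses == 0:
        return 0.0
    return latency_cycles / sampled_misses


def pct_of_total(latency_cycles, total_cycles):
    """Latency cycles as a percent of total_cycles.

    Assumption: total_cycles is the total that the report calls
    "total cycles" for the % Total Dcache Latency Cycles column.
    """
    if total_cycles == 0:
        return 0.0
    return 100.0 * latency_cycles / total_cycles


# Made-up counts for one hypothetical program object.
print(avg_dcache_latency(1200, 8))   # 150.0
print(pct_of_total(1200, 4800))      # 25.0
```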