Intel 64 and IA-32 Architectures Software Developers Manual Volume 3B, System Programming Guide Part 2

PERFORMANCE-MONITORING EVENTS
49H 40H DTLB_MISSES.PDP_MISS Number of DTLB misses where the high part of the linear to physical address translation was missed.
49H 80H DTLB_MISSES.LARGE_WALK_COMPLETED Counts number of completed large page walks due to misses in the STLB.
4BH 01H SSE_MEM_EXEC.NTA Counts number of SSE NTA prefetch/weakly-ordered instructions which missed the L1 data cache.
4BH 08H SSE_MEM_EXEC.STREAMING_STORES Counts number of SSE non-temporal stores
4CH 01H LOAD_HIT_PRE Counts load operations sent to the L1 data cache while a previous SSE prefetch instruction to the same cache line has started prefetching but has not yet finished.
4DH 01H SFENCE_CYCLES Counts store fence cycles
4EH 01H L1D_PREFETCH.REQUESTS Counts number of hardware prefetch requests dispatched out of the prefetch FIFO.
4EH 02H L1D_PREFETCH.MISS Counts number of hardware prefetch requests that miss the L1D. There are two prefetchers in the L1D: a streamer, which predicts that lines sequentially after this one should be fetched, and the IP prefetcher, which remembers access patterns for the current instruction. The streamer prefetcher stops on an L1D hit, while the IP prefetcher does not.
Table A-2. Non-Architectural Performance Events In the Processor Core for Intel Core i7 Processors
Event Num.   Umask Value   Event Mask Mnemonic   Description   Comment
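
As an illustration of how an Event Num. and Umask Value pair from the rows above is typically used, the sketch below packs them into a counter-select value. It assumes the architectural IA32_PERFEVTSELx layout (event select in bits 7:0, unit mask in bits 15:8, USR in bit 16, OS in bit 17, EN in bit 22), which this table does not itself restate. The names EVENTS, raw_config, and perfevtsel are hypothetical helpers introduced only for this example.

```python
# Event Num. / Umask Value pairs copied from the table rows above.
EVENTS = {
    "DTLB_MISSES.PDP_MISS": (0x49, 0x40),
    "DTLB_MISSES.LARGE_WALK_COMPLETED": (0x49, 0x80),
    "SSE_MEM_EXEC.NTA": (0x4B, 0x01),
    "SSE_MEM_EXEC.STREAMING_STORES": (0x4B, 0x08),
    "LOAD_HIT_PRE": (0x4C, 0x01),
    "SFENCE_CYCLES": (0x4D, 0x01),
    "L1D_PREFETCH.REQUESTS": (0x4E, 0x01),
    "L1D_PREFETCH.MISS": (0x4E, 0x02),
}

# Assumed IA32_PERFEVTSELx control bits (architectural layout).
USR = 1 << 16  # count while running at privilege levels 1-3
OS = 1 << 17   # count while running at privilege level 0
EN = 1 << 22   # enable the counter


def raw_config(name):
    """Return (umask << 8) | event for a mnemonic in EVENTS."""
    event, umask = EVENTS[name]
    return (umask << 8) | event


def perfevtsel(name, user=True, kernel=True):
    """Return a full select value with EN set and optional USR/OS bits."""
    value = raw_config(name) | EN
    if user:
        value |= USR
    if kernel:
        value |= OS
    return value


if __name__ == "__main__":
    for mnemonic in EVENTS:
        print(f"{mnemonic}: raw=0x{raw_config(mnemonic):04X} "
              f"evtsel=0x{perfevtsel(mnemonic):06X}")
```

The 16-bit value returned by raw_config has the same umask-then-event shape that tools accepting raw event codes generally expect, so L1D_PREFETCH.MISS becomes 0x024E. Whether a given event is actually implemented on a particular processor still has to be checked against the table for that processor family.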