Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2

PERFORMANCE-MONITORING EVENTS
Event Num: 04H
Umask Value: 08H
Event Name: STORE_BLOCK.SNOOP
Definition: A store is blocked due to a conflict with an external or internal snoop.
Description and Comment: This event counts the number of cycles the store port was used for snooping the L1 data cache and a store was stalled by the snoop. The store is typically resubmitted one cycle later.

Event Num: 06H
Umask Value: 00H
Event Name: SEGMENT_REG_LOADS
Definition: Number of segment register loads
Description and Comment: This event counts the number of segment register load operations. Instructions that load new values into segment registers cause a penalty.
This event indicates performance issues in 16-bit code. If this event occurs frequently, it may be useful to calculate the number of instructions retired per segment register load. If the resulting calculation is low (on average a small number of instructions are executed between segment register loads), then the code’s segment register usage should be optimized.
As a result of branch misprediction, this event is speculative and may include segment register loads that do not actually occur. However, most segment register loads are internally serialized and such speculative effects are minimized.

Event Num: 07H
Umask Value: 00H
Event Name: SSE_PRE_EXEC.NTA
Definition: Streaming SIMD Extensions (SSE) Prefetch NTA instructions executed
Description and Comment: This event counts the number of times the SSE instruction prefetchNTA is executed. This instruction prefetches the data to the L1 data cache.

Event Num: 07H
Umask Value: 01H
Event Name: SSE_PRE_EXEC.L1
Definition: Streaming SIMD Extensions (SSE) PrefetchT0 instructions executed
Description and Comment: This event counts the number of times the SSE instruction prefetchT0 is executed. This instruction prefetches the data to the L1 data cache and L2 cache.
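
The SEGMENT_REG_LOADS description suggests calculating the number of instructions retired per segment register load. A minimal sketch of that calculation follows, assuming both counts were collected over the same interval. The function name and the sample counter values are hypothetical and do not come from the manual, and the manual gives no numeric threshold for what counts as "low". Because the event is speculative, the result is best read as an approximation.

```python
def instructions_per_segment_load(instructions_retired: int,
                                  segment_reg_loads: int) -> float:
    """Average number of instructions retired per segment register load.

    Both arguments are raw counter values sampled over the same interval.
    Returns float('inf') when no segment register loads were counted.
    """
    if instructions_retired < 0 or segment_reg_loads < 0:
        raise ValueError("counter values must be non-negative")
    if segment_reg_loads == 0:
        return float("inf")
    return instructions_retired / segment_reg_loads


# Hypothetical counter readings, for illustration only.
ratio = instructions_per_segment_load(1_200_000, 40_000)
print(f"instructions per segment register load: {ratio:.1f}")
```

A small ratio means segment registers are reloaded after only a few instructions, which is the case the description flags for optimization.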

Table A-6. Non-Architectural Performance Events in Processors Based on Intel Core Microarchitecture (Contd.)
Columns: Event Num, Umask Value, Event Name, Definition, Description and Comment
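
The Event Num and Umask Value columns are the two fields needed to select an event on a performance counter. The sketch below packs them into a value laid out like an IA32_PERFEVTSELx register (event select in bits 7:0, unit mask in bits 15:8, with the USR, OS, and EN flags in bits 16, 17, and 22). The helper names `parse_hex_field` and `encode_event` are hypothetical, and the sketch only computes the value; writing it to a model-specific register requires privileged, platform-specific code that is not shown.

```python
# Bit positions assumed from the IA32_PERFEVTSELx layout:
# event select in bits 7:0, unit mask in bits 15:8,
# USR in bit 16, OS in bit 17, EN in bit 22.
USR_BIT = 1 << 16
OS_BIT = 1 << 17
EN_BIT = 1 << 22


def parse_hex_field(text: str) -> int:
    """Convert a table cell such as '04H' into an integer (0x04)."""
    cleaned = text.strip().upper()
    if not cleaned.endswith("H"):
        raise ValueError(f"expected a value like '04H', got {text!r}")
    value = int(cleaned[:-1], 16)
    if not 0 <= value <= 0xFF:
        raise ValueError(f"field does not fit in 8 bits: {text!r}")
    return value


def encode_event(event_num: str, umask: str, user: bool = True,
                 kernel: bool = True, enable: bool = True) -> int:
    """Pack an event number and unit mask into an event-select value."""
    value = parse_hex_field(event_num) | (parse_hex_field(umask) << 8)
    if user:
        value |= USR_BIT   # count while running at privilege level > 0
    if kernel:
        value |= OS_BIT    # count while running at privilege level 0
    if enable:
        value |= EN_BIT    # enable the counter
    return value


# The four events listed in this part of Table A-6.
EVENTS = {
    "STORE_BLOCK.SNOOP": ("04H", "08H"),
    "SEGMENT_REG_LOADS": ("06H", "00H"),
    "SSE_PRE_EXEC.NTA": ("07H", "00H"),
    "SSE_PRE_EXEC.L1": ("07H", "01H"),
}

for name, (event_num, umask) in EVENTS.items():
    print(f"{name}: 0x{encode_event(event_num, umask):06X}")
```

Passing `user=False, kernel=False, enable=False` yields only the 16-bit event and unit mask pair, which is convenient when a tool sets the control flags itself.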