Intel 64 and IA-32 Architectures Software Developers Manual Volume 3B, System Programming Guide Part 2

DEBUGGING AND PERFORMANCE MONITORING
The four separate tag bits allow the user to simultaneously but distinctly count up to
four execution events at retirement. (This applies for non-precise event-based
sampling. There are additional restrictions for PEBS as noted in Section 18.18.8.3,
“Setting Up the PEBS Buffer.”) It is also possible to detect or count combinations of
events by setting multiple tag value bits in the upstream ESCR or multiple mask bits
in the downstream ESCR. For example, use a tag value of 3H in the upstream ESCR
and use NBOGUS0/NBOGUS1 in the downstream ESCR event mask.
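As an illustration of the combination described above, the following C sketch builds a tag value from individual tag indices. The helper names (tag_bit, tag_value_for) are hypothetical and not part of any Intel interface. The sketch only computes the 4-bit tag value and does not show where that value is placed in the ESCR.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical constant: the text describes four separate tag bits. */
enum { NUM_TAG_BITS = 4 };

/* Return the tag-value bit that corresponds to one tag index (0..3). */
uint8_t tag_bit(unsigned tag_index)
{
    assert(tag_index < NUM_TAG_BITS);
    return (uint8_t)(1u << tag_index);
}

/* Combine two tag indices into a single tag value, so that one
 * upstream ESCR tags uops with both tags at once. */
uint8_t tag_value_for(unsigned first_tag, unsigned second_tag)
{
    return (uint8_t)(tag_bit(first_tag) | tag_bit(second_tag));
}
```

Under these assumptions, tag_value_for(0, 1) yields 3H, the tag value used in the example above together with the NBOGUS0/NBOGUS1 mask bits.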
18.18.7.4 Tagging Mechanism for Replay_event
Table A-10 describes the Replay_event and Table A-14 describes metrics that are
used to set up a Replay_event count.
The replay mechanism enables tagging of μops for a subset of all replays before
retirement. Use of the replay mechanism requires selecting the type of μop that may
experience the replay in the MSR_PEBS_MATRIX_VERT MSR and selecting the type of
event in the MSR_PEBS_ENABLE MSR. Replay tagging must also be enabled with the
UOP_Tag flag (bit 24) in the MSR_PEBS_ENABLE MSR.
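The following C sketch shows one way to form an MSR_PEBS_ENABLE image with the UOP_Tag flag set. It only manipulates a 64-bit value; actually writing the MSR requires a privileged WRMSR, which is not shown. The macro and function names are hypothetical. The metric-specific bits for MSR_PEBS_ENABLE and MSR_PEBS_MATRIX_VERT come from Table A-14 and are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* UOP_Tag flag: bit 24 of MSR_PEBS_ENABLE, as stated in the text.
 * The macro name itself is an illustrative assumption. */
#define PEBS_ENABLE_UOP_TAG (UINT64_C(1) << 24)

/* Return a new MSR image with replay tagging enabled.
 * All other bits of the current image are preserved. */
uint64_t pebs_enable_with_uop_tag(uint64_t current_image)
{
    return current_image | PEBS_ENABLE_UOP_TAG;
}

/* Report whether replay tagging is enabled in a given MSR image. */
int replay_tagging_enabled(uint64_t image)
{
    return (image & PEBS_ENABLE_UOP_TAG) != 0;
}
```

A read-modify-write of this kind preserves the event-selection bits that were already programmed for the chosen metric.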
Table A-14 lists the metrics that support the replay tagging mechanism and
the at-retirement events that use the replay tagging mechanism, and specifies how
the appropriate MSRs need to be configured. The replay tags defined in Table A-5
also enable Precise Event-Based Sampling (PEBS, see Section 15.9.8). Each of these
replay tags can also be used in normal sampling by setting neither bit 24 nor bit 25 in
IA_32_PEBS_ENABLE_MSR. Each of these metrics requires that the Replay_event
(see Table A-10) be used to count the tagged μops.
18.18.8 Precise Event-Based Sampling (PEBS)
The debug store (DS) mechanism in processors based on Intel NetBurst microarchitecture
allows two types of information to be collected for use in debugging and tuning
programs: PEBS records and BTS records. See Section 18.7.8, “Branch Trace Store
(BTS),” for a description of the BTS mechanism.
PEBS permits the saving of precise architectural information associated with one or
more performance events in the precise event records buffer, which is part of the DS
save area (see Section 18.18.5, “DS Save Area”). To use this mechanism, a counter
is configured to overflow after it has counted a preset number of events. After the
counter overflows, the processor copies the current state of the general-purpose and
EFLAGS registers and instruction pointer into a record in the precise event records
buffer. The processor then resets the count in the performance counter and restarts
the counter. When the precise event records buffer is nearly full, an interrupt is
generated, allowing the precise event records to be saved. A circular buffer is not
supported for precise event records.
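The sequence described above can be modeled in software. The following C sketch is a simplified model, not the hardware implementation. The structure layout, field names, and function names are all hypothetical, the record holds only EFLAGS and the instruction pointer, and the counter counts up to a sample interval rather than overflowing from a preset value.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, reduced record; the real layout is defined by the DS save area. */
struct pebs_record_model {
    uint64_t eflags;
    uint64_t ip;
};

struct pebs_model {
    struct pebs_record_model *buf; /* precise event records buffer */
    size_t capacity;               /* total records the buffer can hold */
    size_t threshold;              /* record count that raises the interrupt */
    size_t used;                   /* records currently stored */
    uint64_t interval;             /* events counted per sample */
    uint64_t counter;              /* events counted since the last reset */
    int interrupt_pending;
};

void pebs_model_init(struct pebs_model *m, struct pebs_record_model *buf,
                     size_t capacity, size_t threshold, uint64_t interval)
{
    assert(threshold <= capacity && interval > 0);
    m->buf = buf;
    m->capacity = capacity;
    m->threshold = threshold;
    m->used = 0;
    m->interval = interval;
    m->counter = 0;
    m->interrupt_pending = 0;
}

/* Count one event. When the interval is reached, store a record,
 * reset the counter, and flag an interrupt if the buffer is nearly full. */
void pebs_model_event(struct pebs_model *m, uint64_t eflags, uint64_t ip)
{
    if (++m->counter < m->interval)
        return;
    m->counter = 0;                /* reset the count and restart */
    if (m->used < m->capacity) {   /* no wraparound: the buffer is not circular */
        m->buf[m->used].eflags = eflags;
        m->buf[m->used].ip = ip;
        m->used++;
    }
    if (m->used >= m->threshold)
        m->interrupt_pending = 1;
}

/* Interrupt handler model: save the records to out and empty the buffer. */
size_t pebs_model_drain(struct pebs_model *m, struct pebs_record_model *out)
{
    size_t n = m->used;
    for (size_t i = 0; i < n; i++)
        out[i] = m->buf[i];
    m->used = 0;
    m->interrupt_pending = 0;
    return n;
}
```

Setting the threshold below the capacity leaves room for records that arrive before the handler runs, which matters because the buffer does not wrap.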
PEBS is supported only for a subset of the at-retirement events: Execution_event,
Front_end_event, and Replay_event. Also, PEBS can only be carried out using
one performance counter, the MSR_IQ_COUNTER4 MSR.