Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2

DEBUGGING AND PERFORMANCE MONITORING
Each performance counter is 40 bits wide (see Figure 18-36). The RDPMC instruction
has been enhanced in the Pentium 4 and Intel Xeon processors to allow reading
either the full counter width (40 bits) or only the low 32 bits of the counter. Reading
the low 32 bits is faster than reading the full counter width and is appropriate in
situations where the count is small enough to be contained in 32 bits.
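The two read widths can be illustrated with a minimal sketch in C. It assumes the caller has already obtained the EDX:EAX register pair from a read of the counter; the helper names `pmc_full_count` and `pmc_low32_delta` are hypothetical and are not part of any Intel-defined interface.

```c
#include <stdint.h>

/* Counter width on Pentium 4 and Intel Xeon processors, per the text. */
#define PMC_WIDTH_BITS 40
#define PMC_MASK ((UINT64_C(1) << PMC_WIDTH_BITS) - 1)

/* Hypothetical helper: assemble the full 40-bit count from the EDX:EAX
 * pair of a full-width read. Bits above bit 39 are masked off. */
static inline uint64_t pmc_full_count(uint32_t eax, uint32_t edx)
{
    return (((uint64_t)edx << 32) | eax) & PMC_MASK;
}

/* Hypothetical helper: event count between two low-32-bit reads.
 * Unsigned subtraction wraps modulo 2^32, so the result is correct
 * as long as fewer than 2^32 events occurred between the reads. */
static inline uint32_t pmc_low32_delta(uint32_t start, uint32_t end)
{
    return end - start;
}
```

The delta helper reflects the condition stated above: a 32-bit read is sufficient only when the count of interest fits in 32 bits.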
The RDPMC instruction can be used by programs or procedures running at any
privilege level, and in virtual-8086 mode, to read these counters. The PCE flag in
control register CR4 (bit 8) controls this access: when the flag is clear, use of the
instruction is restricted to programs and procedures running at privilege level 0.
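The access rule can be modeled as a small predicate. This is only a sketch of the check, assuming a saved copy of CR4 and a current privilege level (CPL) in the range 0 to 3; the name `rdpmc_permitted` is hypothetical, and on real hardware the processor enforces the rule itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* PCE is bit 8 of control register CR4, per the text. */
#define CR4_PCE (UINT64_C(1) << 8)

/* Hypothetical model of the RDPMC privilege check.
 * cr4: a saved copy of CR4; cpl: current privilege level (0..3).
 * Privilege level 0 may always execute RDPMC; other levels may do so
 * only while the PCE flag is set. */
static inline bool rdpmc_permitted(uint64_t cr4, unsigned cpl)
{
    return cpl == 0 || (cr4 & CR4_PCE) != 0;
}
```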
The RDPMC instruction is not serializing or ordered with other instructions. Thus, it
does not necessarily wait until all previous instructions have been executed before
reading the counter. Similarly, subsequent instructions may begin execution before
the RDPMC instruction operation is performed.
Only the operating system, executing at privilege level 0, can directly manipulate the
performance counters, using the RDMSR and WRMSR instructions. A secure operating
system would clear the PCE flag during system initialization to disable direct
user access to the performance-monitoring counters, but provide a user-accessible
programming interface that emulates the RDPMC instruction.
Some uses of the performance counters require the counters to be preset before
counting begins (that is, before the counter is enabled). This can be accomplished by
writing to the counter using the WRMSR instruction. To set a counter to a specified
number of counts before overflow, write the two's complement negative of that number
to the counter. The counter will then count from the preset value up to -1 and overflow.
Writing to a performance counter in a Pentium 4 or Intel Xeon processor with the
WRMSR instruction causes all 40 bits of the counter to be written.
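The preset arithmetic can be sketched in C as follows. The helper names `pmc_preset` and `pmc_counts_until_overflow` are hypothetical; the sketch only computes the value that would be handed to WRMSR and does not touch any MSR.

```c
#include <assert.h>
#include <stdint.h>

/* Counter width on Pentium 4 and Intel Xeon processors, per the text. */
#define PMC_WIDTH_BITS 40
#define PMC_MASK ((UINT64_C(1) << PMC_WIDTH_BITS) - 1)

/* Hypothetical helper: the 40-bit two's complement of `events`, that is,
 * the value to write so that the counter overflows after `events`
 * increments. */
static inline uint64_t pmc_preset(uint64_t events)
{
    assert(events >= 1 && events <= PMC_MASK);
    return (0 - events) & PMC_MASK;
}

/* Hypothetical model: number of increments needed for a counter holding
 * `preset` to pass all-ones (-1 in 40 bits) and wrap to zero. */
static inline uint64_t pmc_counts_until_overflow(uint64_t preset)
{
    return (PMC_MASK - (preset & PMC_MASK)) + 1;
}
```

For example, a preset for 1000 events is 0xFFFFFFFC18: the counter reaches 0xFFFFFFFFFF (-1) after 999 increments and overflows on the 1000th.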
18.18.3 CCCR MSRs
Each of the 18 performance counters in a Pentium 4 or Intel Xeon processor has one
CCCR MSR associated with it (see Table 18-26). The CCCRs control the filtering and
counting of events as well as interrupt generation. Figure 18-37 shows the layout of
a CCCR MSR. The functions of the flags and fields are as follows:
Figure 18-36. Performance Counter (Pentium 4 and Intel Xeon Processors)
[Figure: 64-bit register layout. Bits 63:40 are Reserved; bits 39:0 hold the Counter.]