Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 3B: System Programming Guide, Part 2

DEBUGGING AND PERFORMANCE MONITORING
There are several ways to count processor clock cycles to monitor performance.
These are:
• Non-halted clockticks: Measures clock cycles in which the specified logical
  processor is not halted and is not in any power-saving state. When Intel
  Hyper-Threading Technology is enabled, ticks can be measured on a
  per-logical-processor basis. There are also performance events on dual-core
  processors that measure clockticks per logical processor when the processor
  is not halted.
• Non-sleep clockticks: Measures clock cycles in which the specified physical
  processor is not in a sleep mode or in a power-saving state. These ticks
  cannot be measured on a logical-processor basis.
• Time-stamp counter: Measures clock cycles in which the physical processor is
  not in deep sleep. These ticks cannot be measured on a logical-processor
  basis.
• Reference clockticks: TM2 and Enhanced Intel SpeedStep technology are two
  examples of processor features that can cause processor core clockticks to
  represent non-uniform tick intervals due to changes in bus ratios.
  Performance events that count clockticks at a constant reference frequency
  were introduced with Intel Core Duo and Intel Core Solo processors. The
  mechanism is further enhanced on processors based on Intel Core
  microarchitecture.
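To see why a constant reference frequency matters, consider converting tick
counts to elapsed time across a bus-ratio change. The sketch below is
illustrative only: the frequencies, durations, and variable names are
hypothetical and are not taken from this manual.

```python
# Hypothetical scenario: the core runs for 2 s at 2.0 GHz, then for 2 s at
# 1.0 GHz after a bus-ratio change. The reference clock is assumed to tick
# at a constant 200 MHz throughout. All values are illustrative.
phases = [(2.0, 2_000_000_000), (2.0, 1_000_000_000)]  # (seconds, core Hz)
REFERENCE_HZ = 200_000_000
NOMINAL_CORE_HZ = 2_000_000_000

# Tick counts a core-clock counter and a reference-clock counter would report.
core_ticks = sum(int(seconds * hz) for seconds, hz in phases)
reference_ticks = sum(int(seconds * REFERENCE_HZ) for seconds, _ in phases)

# Dividing core ticks by one nominal frequency misestimates elapsed time,
# because each tick did not represent the same interval in both phases.
naive_seconds = core_ticks / NOMINAL_CORE_HZ

# Reference ticks map to elapsed time with a single constant divisor.
reference_seconds = reference_ticks / REFERENCE_HZ

print(naive_seconds, reference_seconds)
```

Under these assumptions the core-tick estimate understates the 4 seconds that
actually elapsed, while the reference-tick estimate recovers it.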
Some processor models permit clock cycles to be measured when the physical
processor is not in deep sleep (by using the time-stamp counter and the RDTSC
instruction). Note that such ticks cannot be measured on a per-logical-processor
basis. See Section 18.11, “Time-Stamp Counter,” for details on processor capabilities.
The first two methods use performance counters and can be set up to cause an
interrupt upon overflow (for sampling). They may also be useful where it is
easier for a tool to read a performance counter than to use the time-stamp
counter (the time-stamp counter is accessed using the RDTSC instruction).
For applications with a significant amount of I/O, there are two ratios of interest:
• Non-halted CPI: Non-halted clockticks/instructions retired measures the CPI
  for phases where the CPU was being used. This ratio can be measured on a
  logical-processor basis when Intel Hyper-Threading Technology is enabled.
• Nominal CPI: Time-stamp counter ticks/instructions retired measures the CPI
  over the duration of a program, including those periods when the machine
  halts while waiting for I/O.
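The two ratios can be illustrated with a small calculation. This is a minimal
sketch, not part of the manual: the function names and counter values are
hypothetical, and the halted-fraction estimate assumes that the time-stamp
counter and the non-halted clocktick counter advance at the same rate.

```python
def non_halted_cpi(non_halted_clockticks: int, instructions_retired: int) -> float:
    """CPI over the phases in which the processor was not halted."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return non_halted_clockticks / instructions_retired


def nominal_cpi(tsc_ticks: int, instructions_retired: int) -> float:
    """CPI over the whole run, including time spent halted waiting for I/O."""
    if instructions_retired <= 0:
        raise ValueError("instructions_retired must be positive")
    return tsc_ticks / instructions_retired


# Hypothetical counter deltas for an I/O-heavy run (illustrative values only).
instructions = 2_000_000
non_halted_ticks = 3_000_000   # cycles counted while not halted
tsc_ticks = 12_000_000         # time-stamp counter delta, includes halted waits

busy_cpi = non_halted_cpi(non_halted_ticks, instructions)
total_cpi = nominal_cpi(tsc_ticks, instructions)

# Rough share of the run spent halted, under the equal-rate assumption above.
halted_fraction = 1 - non_halted_ticks / tsc_ticks

print(busy_cpi, total_cpi, halted_fraction)
```

A large gap between the two ratios, as in this made-up run, suggests that most
of the elapsed time was spent halted rather than executing instructions.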
18.20.1 Non-Halted Clockticks
Use the following procedure to program ESCRs and CCCRs to obtain non-halted
clockticks on processors based on Intel NetBurst microarchitecture:
1. Select an ESCR for the global_power_events and specify the RUNNING sub-event
mask and the desired T0_OS/T0_USR/T1_OS/T1_USR bits for the targeted
processor.