Intel 64 and IA-32 Architectures Software Developers Manual Volume 3A, System Programming Guide, Part 1

SYSTEM ARCHITECTURE OVERVIEW
introduced with the Pentium Pro processor). If any non-wake events are pending
during shutdown, they will be handled after the wake event from shutdown is
processed (for example, A20M# interrupts).
The LOCK prefix invokes a locked (atomic) read-modify-write operation when
modifying a memory operand. This mechanism is used to allow reliable communications
between processors in multiprocessor systems, as described below:
• In the Pentium processor and earlier IA-32 processors, the LOCK prefix causes
  the processor to assert the LOCK# signal during the instruction. This always
  causes an explicit bus lock to occur.
• In the Pentium 4, Intel Xeon, and P6 family processors, the locking operation is
  handled with either a cache lock or a bus lock. If a memory access is cacheable and
  affects only a single cache line, a cache lock is invoked and the system bus and
  the actual memory location in system memory are not locked during the
  operation. Here, other Pentium 4, Intel Xeon, or P6 family processors on the bus
  write back any modified data and invalidate their caches as necessary to
  maintain system memory coherency. If the memory access is not cacheable
  and/or it crosses a cache line boundary, the processor's LOCK# signal is asserted
  and the processor does not respond to requests for bus control during the locked
  operation.
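
The following C sketch illustrates the kind of locked read-modify-write
operation described above. It is not taken from this manual. It uses the
C11 <stdatomic.h> interface, which compilers targeting IA-32 and Intel 64
processors typically implement with LOCK-prefixed instructions; the function
names and the 64-byte cache line size are assumptions made for illustration
(the actual line size is model specific).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed cache line size in bytes; the real value is model specific. */
#define ASSUMED_CACHE_LINE_BYTES 64u

/* Atomically add 1 to *counter and return the value it held before. */
static inline int locked_increment(atomic_int *counter)
{
    return atomic_fetch_add(counter, 1);
}

/* Atomically store 'desired' in *flag only if *flag still equals
 * 'expected'. Returns true if the store took place. */
static inline bool locked_compare_swap(atomic_int *flag, int expected,
                                       int desired)
{
    return atomic_compare_exchange_strong(flag, &expected, desired);
}

/* Returns true if the operand occupying [addr, addr + size) spans more
 * than one assumed cache line. */
static inline bool crosses_cache_line(uintptr_t addr, size_t size)
{
    assert(size > 0);
    return (addr / ASSUMED_CACHE_LINE_BYTES) !=
           ((addr + size - 1) / ASSUMED_CACHE_LINE_BYTES);
}
```

Under these assumptions, a naturally aligned 4-byte operand never crosses a
line boundary, whereas an 8-byte operand starting at offset 60 does. Keeping
locked operands aligned therefore avoids the cache-line-crossing case
described above.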
The RSM (return from SMM) instruction restores the processor (from a context
dump) to the state it was in prior to a system management mode (SMM) interrupt.
2.6.6 Reading Performance-Monitoring and Time-Stamp Counters
The RDPMC (read performance-monitoring counter) and RDTSC (read time-stamp
counter) instructions allow application programs to read the processor’s
performance-monitoring and time-stamp counters, respectively. Pentium 4 and Intel Xeon
processors have eighteen 40-bit performance-monitoring counters; P6 family
processors have two 40-bit counters.
Use these counters to record either the occurrence or duration of events. Events that
can be monitored are model specific; they may include the number of instructions
decoded, interrupts received, or the number of cache loads. Individual counters can
be set up to monitor different events. Use the system instruction WRMSR to set up
values in one of the 45 ESCRs and one of the 18 CCCR MSRs (for Pentium 4 and
Intel Xeon processors); or in the PerfEvtSel0 or the PerfEvtSel1 MSR (for the P6
family processors). The RDPMC instruction loads the current count from the selected
counter into the EDX:EAX registers.
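
As an illustration of how software might interpret the EDX:EAX result, the
following C sketch combines the two 32-bit halves into a single value and
computes the difference between two readings of a 40-bit counter. The
function names are hypothetical, and the masking to 40 bits is a defensive
assumption based on the counter width stated above; the sketch does not
execute RDPMC itself.

```c
#include <assert.h>
#include <stdint.h>

/* Mask selecting the low 40 bits (counter width stated in the text). */
#define PMC_MASK_40 ((UINT64_C(1) << 40) - 1)

/* Combine the EDX (high) and EAX (low) halves returned by RDPMC into
 * one 40-bit counter value. */
static inline uint64_t pmc_value(uint32_t edx, uint32_t eax)
{
    return (((uint64_t)edx << 32) | eax) & PMC_MASK_40;
}

/* Number of events counted between two readings, allowing for a single
 * wraparound of the 40-bit counter between them. */
static inline uint64_t pmc_delta(uint64_t start, uint64_t end)
{
    assert(start <= PMC_MASK_40 && end <= PMC_MASK_40);
    return (end - start) & PMC_MASK_40;
}
```

Taking the difference modulo 2^40 gives the correct event count even when
the counter overflows once between the two readings.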
The time-stamp counter is a model-specific 64-bit counter that is reset to zero each
time the processor is reset. If not reset, the counter will increment ~9.5 x 10^16
times per year when the processor is operating at a clock rate of 3 GHz. At this
clock frequency, it would take over 190 years for the counter to wrap around. The
RDTSC instruction loads the current count of the time-stamp counter into the
EDX:EAX registers.
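
The wrap-around figure can be checked with a short calculation. The C sketch
below is illustrative only: it assumes the counter increments once per clock
cycle and a 365.25-day year, and the function names are hypothetical. A
read_tsc() helper is included for GCC-compatible compilers on IA-32 and
Intel 64 targets; it assumes the operating system permits RDTSC at the
current privilege level.

```c
#include <assert.h>
#include <stdint.h>

/* Seconds in a 365.25-day year (an assumption for this estimate). */
#define SECONDS_PER_YEAR (365.25 * 24.0 * 3600.0)

/* 2^64: the number of distinct values of a 64-bit counter. */
#define TSC_MODULUS 18446744073709551616.0

/* Counter increments per year at the given clock rate in Hz, assuming
 * one increment per clock cycle. */
static inline double tsc_ticks_per_year(double clock_hz)
{
    assert(clock_hz > 0.0);
    return clock_hz * SECONDS_PER_YEAR;
}

/* Years until a 64-bit counter starting from zero wraps around. */
static inline double tsc_years_to_wrap(double clock_hz)
{
    return TSC_MODULUS / tsc_ticks_per_year(clock_hz);
}

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
/* Read the time-stamp counter: RDTSC returns the high half in EDX and
 * the low half in EAX. */
static inline uint64_t read_tsc(void)
{
    uint32_t eax, edx;
    __asm__ volatile ("rdtsc" : "=a"(eax), "=d"(edx));
    return ((uint64_t)edx << 32) | eax;
}
#endif
```

For a 3 GHz clock, tsc_ticks_per_year(3.0e9) is about 9.47 x 10^16 and
tsc_years_to_wrap(3.0e9) is a little under 195 years, consistent with the
figures given above.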