User's Manual

8-66 Vol. 3
MULTIPLE-PROCESSOR MANAGEMENT
8.10.4 MONITOR/MWAIT Instruction
Operating systems usually implement idle loops to handle thread synchronization. In
a typical idle-loop scenario, there may be several “busy loops” that use a
set of memory locations. An affected processor waits in a loop and polls a memory
location to determine whether work is available to execute. The posting of work is
typically a write to memory (the work-queue of the waiting processor). The time for
initiating a work request and getting it scheduled is on the order of a few bus cycles.
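The polling pattern described above can be sketched in C as follows. This is a minimal illustration rather than code taken from this manual: the names work_slot, post_work, and poll_for_work are hypothetical, and the max_spins bound is an assumption added only so that the sketch terminates when run on its own; a real idle loop would keep polling until work arrives.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-entry work queue owned by a waiting processor. */
struct work_slot {
    atomic_int pending;   /* 0 = empty, nonzero = work has been posted */
    int        item;      /* payload written by the posting processor  */
};

/* Posting work is a write to the memory location the waiter polls. */
static void post_work(struct work_slot *slot, int item)
{
    assert(slot != NULL);
    slot->item = item;
    /* Release ordering publishes the payload before the flag. */
    atomic_store_explicit(&slot->pending, 1, memory_order_release);
}

/*
 * Busy loop: poll the flag until work appears or the spin budget is
 * exhausted. Returns true and stores the item in *out on success.
 */
static bool poll_for_work(struct work_slot *slot, size_t max_spins, int *out)
{
    assert(slot != NULL && out != NULL);
    for (size_t i = 0; i < max_spins; i++) {
        if (atomic_load_explicit(&slot->pending, memory_order_acquire)) {
            *out = slot->item;
            /* Mark the slot empty again so new work can be posted. */
            atomic_store_explicit(&slot->pending, 0, memory_order_relaxed);
            return true;
        }
    }
    return false;
}
```

The waiter consumes execution resources on every iteration of the loop, which is the cost that the remainder of this section sets out to reduce.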
From a resource-sharing perspective (logical processors sharing execution
resources), use of the HLT instruction in an OS idle loop is desirable but has
implications. Executing the HLT instruction on an idle logical processor puts that
processor in a non-execution state. Another processor that posts work for the
halted logical processor must then wake it up using an inter-processor
interrupt. The posting and servicing of such an interrupt introduces a delay
in the servicing of new work requests.
In a shared memory configuration, exits from busy loops usually occur because of a
state change applicable to a specific memory location; such a change tends to be
triggered by writes to the memory location by another agent (typically a processor).
MONITOR/MWAIT complement the use of HLT and PAUSE to allow for efficient
partitioning and un-partitioning of shared resources among logical processors sharing
physical resources. MONITOR sets up an effective address range that is monitored for
write-to-memory activities; MWAIT places the processor in an optimized state (which
may vary between implementations) until a write to the monitored address
range occurs.
In their initial implementation, MONITOR and MWAIT are available at CPL = 0
only.
Both instructions rely on the state of the processor’s monitor hardware. The monitor
hardware can be either armed (by executing the MONITOR instruction) or triggered
(due to a variety of events, including a store to the monitored memory region). If
the monitor hardware is in a triggered state when MWAIT executes, MWAIT behaves
as a NOP and execution continues at the next instruction in the execution stream.
The state of monitor hardware is not architecturally visible except through the
behavior of MWAIT.
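The armed and triggered states can be made concrete with a small software model. The sketch below is a simulation only and does not execute either instruction; monitor_model, model_init, model_monitor, model_store, and model_mwait are hypothetical names. It makes two simplifying assumptions that are not stated in this section: the monitored range is exactly the size passed in (real hardware determines the range size itself), and hardware on which MONITOR has not yet been executed is treated as triggered.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum monitor_state { MONITOR_TRIGGERED, MONITOR_ARMED };

/* Hypothetical model of one logical processor's monitor hardware. */
struct monitor_model {
    enum monitor_state state;
    uintptr_t          base;   /* start of the monitored range */
    size_t             size;   /* length of the range in bytes (assumed) */
};

/* Assumption: with no MONITOR executed yet, the model starts triggered. */
static void model_init(struct monitor_model *m)
{
    assert(m != NULL);
    m->state = MONITOR_TRIGGERED;
    m->base = 0;
    m->size = 0;
}

/* MONITOR: arm the hardware on the given address range. */
static void model_monitor(struct monitor_model *m, const void *addr, size_t size)
{
    m->base = (uintptr_t)addr;
    m->size = size;
    m->state = MONITOR_ARMED;
}

/* A store by any agent: triggers the monitor if it hits the range. */
static void model_store(struct monitor_model *m, const void *addr)
{
    uintptr_t a = (uintptr_t)addr;
    if (m->state == MONITOR_ARMED && a >= m->base && a - m->base < m->size)
        m->state = MONITOR_TRIGGERED;
}

/*
 * MWAIT: returns true if the processor would enter the optimized wait
 * state, false if the instruction would behave as a NOP.
 */
static bool model_mwait(const struct monitor_model *m)
{
    return m->state == MONITOR_ARMED;
}
```

As in the hardware, the only way to observe the state of the model is through the result of model_mwait.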
Multiple events other than a write to the triggering address range can cause a
processor that executed MWAIT to wake up. These include events that would lead to
voluntary or involuntary context switches, such as:
• External interrupts, including NMI, SMI, INIT, BINIT, MCERR, A20M#
• Faults, Aborts (including Machine Check)
• Architectural TLB invalidations including writes to CR0, CR3, CR4 and certain MSR
  writes; execution of LMSW (occurring prior to issuing MWAIT but after setting the
  monitor)
• Voluntary transitions due to fast system call and far calls (occurring prior to
  issuing MWAIT but after setting the monitor)
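Because MWAIT can return for reasons other than a store to the monitored range, software typically re-checks the condition it is waiting on after every wake-up, and once more between MONITOR and MWAIT so that a store arriving in that window is not missed. The sketch below shows that loop shape under stated assumptions: wait_ops, wait_for_flag, hw_monitor, hw_mwait, and the sim_ names are all hypothetical; the two instructions are reached through function pointers so that the loop can be exercised without CPL 0; the inline-assembly wrappers assume GCC or Clang on an x86 target and pass zero for the extensions and hints registers; and the sim_ functions are user-space stand-ins that imitate two unrelated wake-ups followed by the awaited store.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical indirection over the two instructions. */
struct wait_ops {
    void (*monitor)(const volatile void *addr);
    void (*mwait)(void);
};

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
/* Real wrappers (assumed GCC/Clang syntax); usable at CPL = 0 only. */
static inline void hw_monitor(const volatile void *addr)
{
    /* EAX = address, ECX = extensions (0), EDX = hints (0). */
    __asm__ volatile("monitor" : : "a"(addr), "c"(0UL), "d"(0UL) : "memory");
}

static inline void hw_mwait(void)
{
    /* EAX = hints (0), ECX = extensions (0). */
    __asm__ volatile("mwait" : : "a"(0UL), "c"(0UL) : "memory");
}
#endif

/*
 * Wait until *flag becomes nonzero. Returns the number of times MWAIT
 * returned, including wake-ups not caused by a store to the flag.
 */
static unsigned wait_for_flag(const volatile int *flag,
                              const struct wait_ops *ops)
{
    unsigned wakeups = 0;
    assert(flag != NULL && ops != NULL);
    while (!*flag) {
        ops->monitor(flag);   /* arm the monitor on the flag's address */
        if (*flag)            /* the store may have landed before arming */
            break;
        ops->mwait();         /* may return for any of the events above */
        wakeups++;
    }
    return wakeups;
}

/* User-space stand-ins: two unrelated wake-ups, then the store arrives. */
static volatile int sim_flag;
static int sim_spurious_left = 2;

static void sim_monitor(const volatile void *addr) { (void)addr; }

static void sim_mwait(void)
{
    if (sim_spurious_left > 0)
        sim_spurious_left--;  /* woke up for an unrelated event */
    else
        sim_flag = 1;         /* another agent posted work */
}
```

With the stand-ins, wait_for_flag(&sim_flag, &ops) for ops = { sim_monitor, sim_mwait } returns 3: two wake-ups that leave the flag unchanged, then one caused by the store. Kernel code would pass the hardware wrappers instead.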