Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Basic Architecture

PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3
12.3.6 Two Thread Synchronization Instructions
The MONITOR instruction sets up an address range that is monitored for
write-back stores.
MWAIT enables a logical processor to enter an optimized state while waiting for
a write-back store to the address range set up by MONITOR. MONITOR and MWAIT
take their inputs from general-purpose registers. The registers used by
MONITOR and MWAIT must be initialized properly; these instructions do not
modify register contents.
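As an illustration of the register usage described above, the following is a minimal sketch of a MONITOR/MWAIT wait loop in C with GCC/Clang inline assembly. The helper names (cpu_monitor, cpu_mwait, wait_for_store) are hypothetical, and passing 0 for the extensions and hints values is an assumption made for simplicity. The instructions are typically usable only at ring 0, so the loop is meant for privileged code; on other targets the sketch falls back to plain polling.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
/* MONITOR inputs: EAX/RAX = address to monitor, ECX = extensions, EDX = hints.
 * There are no output operands because the instruction leaves the registers
 * unmodified. */
static inline void cpu_monitor(const volatile void *addr,
                               uint32_t extensions, uint32_t hints)
{
    __asm__ volatile("monitor"
                     : /* no outputs */
                     : "a"(addr), "c"(extensions), "d"(hints)
                     : "memory");
}

/* MWAIT inputs: EAX = hints, ECX = extensions. */
static inline void cpu_mwait(uint32_t extensions, uint32_t hints)
{
    __asm__ volatile("mwait"
                     : /* no outputs */
                     : "a"(hints), "c"(extensions)
                     : "memory");
}
#else
/* Assumption: on non-x86 targets the helpers are no-ops, so the loop below
 * degenerates into simple polling. */
static inline void cpu_monitor(const volatile void *addr,
                               uint32_t extensions, uint32_t hints)
{
    (void)addr; (void)extensions; (void)hints;
}

static inline void cpu_mwait(uint32_t extensions, uint32_t hints)
{
    (void)extensions; (void)hints;
}
#endif

/* Wait until *flag becomes nonzero. Returns the number of times MWAIT was
 * executed. If the flag is already set, neither instruction is executed. */
static int wait_for_store(volatile uint32_t *flag)
{
    int waits = 0;

    while (*flag == 0) {
        cpu_monitor(flag, 0, 0);   /* arm monitoring of the flag's address */
        if (*flag != 0)            /* re-check: the store may have landed   */
            break;                 /* between the first test and MONITOR    */
        cpu_mwait(0, 0);           /* wait for a store to the range         */
        waits++;
    }
    return waits;
}
```

The flag is re-checked between MONITOR and MWAIT so that a store arriving just before monitoring is armed is not missed. MWAIT can also return for other reasons, which is why the outer loop tests the flag again.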
12.4 WRITING APPLICATIONS WITH SSE3 EXTENSIONS
The following sections give guidelines for writing application programs and
operating-system code that use SSE3 instructions.
12.4.1 Guidelines for Using SSE3 Extensions
The following guidelines describe how to maximize the benefits of using SSE3
extensions:
• Ensure that the processor supports SSE3 extensions.
• Ensure that your operating system supports SSE/SSE2/SSE3 extensions.
  (Operating system support for the SSE extensions implies support for the SSE2
  extensions and for the x87 and SIMD instructions of the SSE3 extensions.)
• Ensure that your operating system supports MONITOR and MWAIT.
• Employ the optimization and scheduling techniques described in the Intel® 64
  and IA-32 Architectures Optimization Reference Manual (see Section 1.4,
  “Related Literature”).
12.4.2 Checking for SSE3 Support
Before an application attempts to use the SIMD subset of SSE3 extensions, the
application should follow the steps illustrated in Section 11.6.2, “Checking for
SSE/SSE2 Support.” Next, use the additional step provided below:
• Check that the processor supports the SIMD and x87 SSE3 extensions (if
  CPUID.01H:ECX.SSE3[bit 0] = 1). See Example 12-1 for a code example.
Checking for SSE and SSE2 support in addition to SSE3 support gives software the
flexibility to use SSE3. To use FISTTP, software can use the step above to
detect support for SSE3.
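The additional SSE3 check can be sketched in C as follows. This is a minimal illustration, not the manual's Example 12-1. It assumes a GCC or Clang toolchain that provides <cpuid.h> and its __get_cpuid helper, the function names are hypothetical, and it covers only the CPUID.01H:ECX.SSE3[bit 0] test, not the earlier SSE/SSE2 and operating-system checks from Section 11.6.2.

```c
#include <stdint.h>

#if (defined(__x86_64__) || defined(__i386__)) && defined(__GNUC__)
#include <cpuid.h>            /* GCC/Clang helper header: __get_cpuid */
#define HAVE_CPUID_H 1
#endif

/* CPUID.01H:ECX.SSE3 is bit 0 of ECX. */
#define CPUID_01H_ECX_SSE3 (1u << 0)

/* Decode the SSE3 feature flag from a raw CPUID.01H ECX value.
 * Returns 1 if the bit is set, 0 otherwise. */
static int ecx_reports_sse3(uint32_t ecx)
{
    return (ecx & CPUID_01H_ECX_SSE3) != 0;
}

/* Query the running processor. Returns 1 if CPUID leaf 01H reports SSE3,
 * and 0 if it does not, if leaf 01H is unavailable, or if this is not an
 * x86 build with <cpuid.h> (an assumption of this sketch). */
static int cpu_reports_sse3(void)
{
#ifdef HAVE_CPUID_H
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;             /* CPUID leaf 01H not supported */
    return ecx_reports_sse3(ecx);
#else
    return 0;
#endif
}
```

A caller would gate its SSE3 and FISTTP code paths on cpu_reports_sse3() only after the SSE/SSE2 and operating-system checks have passed. Keeping the bit decode in a separate function lets it be exercised with fixed ECX values on any host.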
In the initial implementation of MONITOR and MWAIT, these two instructions are
available at ring 0 and conditionally available at ring levels greater than 0. Before an