Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture

PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)
• Instructions that support explicit prefetching of data, control of the cacheability
  of data, and control of the ordering of store operations.
• Extensions to the CPUID instruction.
These features extend the IA-32 architecture’s SIMD programming model in four
important ways:
• The ability to perform SIMD operations on four packed single-precision floating-
  point values enhances the performance of IA-32 processors for advanced media
  and communications applications that use computation-intensive algorithms to
  perform repetitive operations on large arrays of simple, native data elements.
• The ability to perform SIMD single-precision floating-point operations in XMM
  registers and SIMD integer operations in MMX registers provides greater
  flexibility and throughput for executing applications that operate on large arrays
  of floating-point and integer data.
• Cache control instructions provide the ability to stream data in and out of XMM
  registers without polluting the caches and the ability to prefetch data to selected
  cache levels before it is actually used. Applications that require regular access to
  large amounts of data benefit from these prefetching and streaming store
  capabilities.
• The SFENCE (store fence) instruction provides greater control over the ordering
  of store operations when using weakly-ordered memory types.
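
The four capabilities above can be sketched together in C using the compiler intrinsics declared in <xmmintrin.h>. This is a minimal illustration, not code from this manual: the function name add_arrays, the prefetch distance of 16 elements, and the SKETCH_USE_SSE macro are assumptions chosen for the example, and the scalar fallback exists only so the sketch builds on targets without SSE.

```c
#include <stddef.h>
#include <stdint.h>

/* Use the SSE intrinsics only when the compiler targets SSE; otherwise fall
 * back to plain C so the sketch still builds on other targets. */
#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>
#define SKETCH_USE_SSE 1
#else
#define SKETCH_USE_SSE 0
#endif

/* Hypothetical helper (not from the manual): dst[i] = a[i] + b[i]. */
void add_arrays(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if SKETCH_USE_SSE
    /* MOVNTPS requires a 16-byte-aligned destination. */
    const int dst_aligned = ((uintptr_t)dst & 15u) == 0;

    for (; i + 4 <= n; i += 4) {
        /* PREFETCHNTA: hint that data 16 floats ahead will be needed soon.
         * The distance of 16 is an arbitrary choice for illustration. */
        if (i + 16 < n)
            _mm_prefetch((const char *)(a + i + 16), _MM_HINT_NTA);

        __m128 va  = _mm_loadu_ps(a + i);  /* load 4 packed floats (MOVUPS) */
        __m128 vb  = _mm_loadu_ps(b + i);
        __m128 sum = _mm_add_ps(va, vb);   /* 4 additions at once (ADDPS)   */

        if (dst_aligned)
            _mm_stream_ps(dst + i, sum);   /* non-temporal store (MOVNTPS)  */
        else
            _mm_storeu_ps(dst + i, sum);   /* ordinary unaligned store      */
    }
    /* SFENCE: make the weakly-ordered streaming stores globally visible
     * before any store that follows. */
    _mm_sfence();
#endif
    /* Scalar tail (and the whole loop when SSE is unavailable). */
    for (; i < n; ++i)
        dst[i] = a[i] + b[i];
}
```

Whether the prefetch hint and the non-temporal stores actually help depends on the array sizes and the access pattern; for small arrays that are reused soon afterwards, ordinary stores are likely the better choice.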
SSE extensions are fully compatible with all software written for IA-32 processors. All
existing software continues to run correctly, without modification, on processors that
incorporate SSE extensions. Enhancements to CPUID permit detection of SSE
extensions. SSE extensions are accessible from all IA-32 execution modes: protected
mode, real-address mode, and virtual-8086 mode.
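
As an illustration of CPUID-based detection, the following C sketch tests the SSE feature flag, which CPUID reports in bit 25 of EDX for leaf 01H. It assumes a GCC or Clang toolchain that provides <cpuid.h> and its __get_cpuid helper; the function names edx_reports_sse and cpu_has_sse are hypothetical. On other compilers or targets the sketch conservatively reports no support.

```c
#include <stdint.h>

/* CPUID leaf 01H reports SSE support in EDX bit 25. */
#define CPUID_EDX_SSE_BIT 25u

/* Hypothetical helper: decode the SSE flag from a leaf-01H EDX value. */
int edx_reports_sse(uint32_t edx)
{
    return (int)((edx >> CPUID_EDX_SSE_BIT) & 1u);
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>

/* Hypothetical helper: returns 1 if the processor reports SSE, else 0. */
int cpu_has_sse(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not supported. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;
    return edx_reports_sse(edx);
}
#else
/* Assumption: without a CPUID path, report "not detected". */
int cpu_has_sse(void)
{
    return 0;
}
#endif
```

The feature flag only indicates that the processor implements SSE. An application typically also relies on the operating system having enabled the extensions, which is covered by the system-programming guidelines referenced at the end of this section.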
The following sections of this chapter describe the programming environment for SSE
extensions, including: XMM registers, the packed single-precision floating-point data
type, and SSE instructions. For additional information, see:
• Section 11.6, “Writing Applications with SSE/SSE2 Extensions.”
• Section 11.5, “SSE, SSE2, and SSE3 Exceptions,” which describes the exceptions that
  can be generated with SSE/SSE2/SSE3 instructions.
• Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volumes
  2A & 2B, which provide a detailed description of these instructions.
• Chapter 12, “System Programming for Streaming SIMD Instruction Sets,” in the
  Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A,
  which gives guidelines for integrating these extensions into an operating-system
  environment.