Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Basic Architecture
CHAPTER 12
PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3
The Pentium 4 processor supporting Hyper-Threading Technology introduced
Streaming SIMD Extensions 3 (SSE3). The Intel Xeon processor 5100 series and the
Intel Core 2 processor families introduced Supplemental Streaming SIMD Extensions 3
(SSSE3). This chapter describes SSE3 and SSSE3 and provides information to assist in
writing application programs that use these extensions.
12.1 SSE3/SSSE3 PROGRAMMING ENVIRONMENT AND DATA TYPES
The programming environment for using SSE3/SSSE3 is unchanged from that shown
in Figure 3-1 and Figure 11-1. SSE3/SSSE3 do not introduce new data types. XMM
registers are used to operate on packed integer data, single-precision floating-point
data, or double-precision floating-point data.
One SSE3 instruction (FISTTP) uses the x87 FPU for x87-style programming. Two
SSE3 instructions (MONITOR and MWAIT) use the general-purpose registers for thread
synchronization. The MXCSR register governs SIMD floating-point operations. Note,
however, that the x87 FPU control word does not affect FISTTP, the SSE3 instruction
executed by the x87 FPU, other than by unmasking an invalid-operand or
inexact-result exception.
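The practical consequence of the control word not affecting FISTTP is that its
result does not depend on the rounding-control (RC) field: it always truncates
toward zero, whereas the older FISTP rounds according to RC. The following is a
minimal software sketch of that difference, not an implementation of either
instruction. The names model_fistp, model_fisttp, and enum x87_rc are
hypothetical, introduced only for illustration, and the model ignores exception
behavior and out-of-range operands.

```c
#include <assert.h>

/* Encodings of the x87 control word rounding-control (RC) field. */
enum x87_rc { RC_NEAREST = 0, RC_DOWN = 1, RC_UP = 2, RC_TRUNCATE = 3 };

/* Hypothetical model of FISTP: the stored integer depends on RC. */
long model_fistp(double x, enum x87_rc rc) {
    assert(x > -2e9 && x < 2e9);   /* keep the model inside the range of long */
    long t = (long)x;              /* a C cast truncates toward zero */
    double frac = x - (double)t;   /* same sign as x, magnitude below 1 */
    long away = (frac < 0.0) ? t - 1 : t + 1;
    double mag = (frac < 0.0) ? -frac : frac;

    switch (rc) {
    case RC_DOWN:     return (frac < 0.0) ? t - 1 : t;
    case RC_UP:       return (frac > 0.0) ? t + 1 : t;
    case RC_TRUNCATE: return t;
    case RC_NEAREST:
    default:
        if (mag > 0.5) return away;
        if (mag < 0.5) return t;
        return (t % 2 == 0) ? t : away;   /* tie: round to even */
    }
}

/* Hypothetical model of FISTTP: RC is ignored, the value is always truncated. */
long model_fisttp(double x, enum x87_rc rc) {
    (void)rc;                      /* deliberately unused */
    assert(x > -2e9 && x < 2e9);
    return (long)x;
}
```

Under this model, model_fistp(2.7, RC_UP) and model_fistp(2.7, RC_DOWN) differ,
while model_fisttp returns the same value for every RC setting. This matches
the semantics of a C floating-point-to-integer cast, so code using FISTTP does
not need to save, modify, and restore the control word around each conversion.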
12.1.1 SSE3/SSSE3 in 64-Bit Mode and Compatibility Mode
In compatibility mode, SSE3/SSSE3 function as they do in protected mode. In
64-bit mode, eight additional XMM registers are accessible. Registers XMM8-XMM15
are accessed by using REX prefixes.
Memory operands are specified using the ModR/M and SIB encoding described in
Section 3.7.5.
Some SSE3 instructions may be used to operate on general-purpose registers. Use
the REX.W prefix to access 64-bit general-purpose registers. Note that if a REX prefix
is used when it has no meaning, the prefix is ignored.
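To make the role of the REX prefix concrete, the sketch below shows how the
prefix byte (bit pattern 0100WRXB) is composed and how its R and B bits extend
the 3-bit ModR/M register fields to 4 bits, which is what makes XMM8-XMM15
and the upper general-purpose registers addressable. The helper names rex_byte,
reg_number, and rm_number are hypothetical, and the sketch covers only the
register-direct form of ModR/M (mod = 11B); it is an illustration, not a
decoder.

```c
#include <assert.h>
#include <stdint.h>

/* Compose a REX prefix byte: high nibble 0100, then the W, R, X, B bits. */
uint8_t rex_byte(int w, int r, int x, int b) {
    return (uint8_t)(0x40 | ((w & 1) << 3) | ((r & 1) << 2)
                          | ((x & 1) << 1) | (b & 1));
}

/* REX.R supplies bit 3 of the register number held in ModR/M.reg. */
unsigned reg_number(uint8_t rex, uint8_t modrm) {
    unsigned r = (rex >> 2) & 1u;
    return (r << 3) | ((modrm >> 3) & 7u);
}

/* REX.B supplies bit 3 of the register number held in ModR/M.r/m.
   Only valid here for the register-direct form (mod == 3). */
unsigned rm_number(uint8_t rex, uint8_t modrm) {
    assert((modrm >> 6) == 3);
    unsigned b = rex & 1u;
    return (b << 3) | (modrm & 7u);
}

/* REX.W selects a 64-bit operand size where the instruction supports it. */
int rex_w(uint8_t rex) {
    return (rex >> 3) & 1;
}
```

For example, a REX value of 44H (R = 1) combined with a ModR/M byte of C1H
yields register number 8 for the reg field and 1 for the r/m field, which for an
XMM instruction would name XMM8 and XMM1. A REX value of 48H sets only W.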