Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture
PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)
a scalar single-precision floating-point value into a doubleword integer (see
Figure 11-8).
SSE extensions provide conversion instructions between XMM registers and MMX
registers, and between XMM registers and general-purpose registers. See
Figure 11-8.
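As an illustration of the conversions between XMM registers and general-purpose registers, the following is a minimal sketch using the compiler intrinsics from `<xmmintrin.h>`. It assumes an x86 target compiled with SSE enabled and the default MXCSR rounding mode (round to nearest even). The helper names `float_to_int_rounded`, `float_to_int_truncated`, and `int_to_float` are hypothetical and chosen only for this example.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE enabled */

/* CVTSS2SI: convert the low single-precision value of an XMM register to a
   doubleword integer in a general-purpose register. Rounding follows the
   MXCSR rounding-control field (round to nearest even by default). */
static int float_to_int_rounded(float x)
{
    return _mm_cvtss_si32(_mm_set_ss(x));
}

/* CVTTSS2SI: the same conversion, but always truncating toward zero. */
static int float_to_int_truncated(float x)
{
    return _mm_cvttss_si32(_mm_set_ss(x));
}

/* CVTSI2SS: convert a doubleword integer from a general-purpose register
   to a single-precision value in the low element of an XMM register. */
static float int_to_float(int n)
{
    return _mm_cvtss_f32(_mm_cvtsi32_ss(_mm_setzero_ps(), n));
}
```

Under the default rounding mode, `float_to_int_rounded(3.7f)` yields 4, while `float_to_int_truncated(3.7f)` yields 3.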
The address of a 128-bit packed memory operand must be aligned on a 16-byte
boundary, except in the following cases:
• The MOVUPS instruction supports unaligned accesses.
• Scalar instructions that use a 4-byte memory operand are not subject to
alignment requirements.
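The alignment rules above can be sketched with intrinsics as follows. This is an illustrative example only, assuming an x86 target with SSE enabled. The helper names `copy4_aligned`, `copy4_unaligned`, and `load_scalar` are hypothetical. `_mm_load_ps` corresponds to the aligned MOVAPS form, `_mm_loadu_ps` to MOVUPS, and `_mm_load_ss` to the scalar MOVSS load.

```c
#include <assert.h>
#include <stdint.h>
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE enabled */

/* Aligned 128-bit load (MOVAPS): src must lie on a 16-byte boundary,
   otherwise the instruction may raise a general-protection exception. */
static void copy4_aligned(float *dst, const float *src)
{
    assert(((uintptr_t)src % 16) == 0);   /* guard the alignment requirement */
    _mm_storeu_ps(dst, _mm_load_ps(src));
}

/* Unaligned 128-bit load (MOVUPS): any address is acceptable. */
static void copy4_unaligned(float *dst, const float *src)
{
    _mm_storeu_ps(dst, _mm_loadu_ps(src));
}

/* Scalar load (MOVSS): the 4-byte memory operand has no 16-byte
   alignment requirement. */
static float load_scalar(const float *src)
{
    return _mm_cvtss_f32(_mm_load_ss(src));
}
```

A caller would typically obtain a suitably aligned buffer with `_Alignas(16)` (C11) or an aligned allocator, and fall back to the unaligned form when the alignment of the source is unknown.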
Figure 4-2 shows the byte order of 128-bit (double quadword) data types in memory.
10.4 SSE INSTRUCTION SET
SSE instructions are divided into four functional groups:
• Packed and scalar single-precision floating-point instructions
• 64-bit SIMD integer instructions
• State management instructions
• Cacheability control, prefetch, and memory ordering instructions
The following sections give an overview of each of the instructions in these groups.
10.4.1 SSE Packed and Scalar Floating-Point Instructions
The packed and scalar single-precision floating-point instructions are divided into the
following subgroups:
• Data movement instructions
• Arithmetic instructions
• Logical instructions
• Comparison instructions
• Shuffle instructions
• Conversion instructions
The packed single-precision floating-point instructions perform SIMD operations on
packed single-precision floating-point operands (see Figure 10-5). Each source
operand contains four single-precision floating-point values, and the destination
operand contains the results of the operation (OP) performed in parallel on the
corresponding values (X0 and Y0, X1 and Y1, X2 and Y2, and X3 and Y3) in each operand.
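A packed operation of this kind can be sketched with intrinsics, taking addition (ADDPS) as the operation OP. This is a minimal example, assuming an x86 target with SSE enabled. The function name `add4` is hypothetical, and unaligned loads and stores are used so that the caller's arrays need no particular alignment.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE enabled */

/* Compute out[i] = x[i] + y[i] for i = 0..3 with a single packed add.
   Each __m128 holds four single-precision values (X0..X3 and Y0..Y3). */
static void add4(const float *x, const float *y, float *out)
{
    __m128 vx = _mm_loadu_ps(x);      /* X3 X2 X1 X0 */
    __m128 vy = _mm_loadu_ps(y);      /* Y3 Y2 Y1 Y0 */
    __m128 vr = _mm_add_ps(vx, vy);   /* ADDPS: four additions in parallel */
    _mm_storeu_ps(out, vr);
}
```

For example, with x = {1, 2, 3, 4} and y = {10, 20, 30, 40}, the destination receives {11, 22, 33, 44}.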