Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Basic Architecture

PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2)
the two packed double-precision floating-point values from the source operand in the
high quadword of the destination operand (see Figure 11-5). By using the same
register for the source and destination operands, the SHUFPD instruction can swap
the two packed double-precision floating-point values in that register.
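
The selection rule can be illustrated with a small scalar model in C. This is only a sketch, not Intel's definition: the type xmm_pd and the functions shufpd_model and swap_pd are hypothetical names introduced here, and the model assumes the usual immediate encoding in which bit 0 of the imm8 operand selects the value taken from the destination and bit 1 selects the value taken from the source.

```c
/* Hypothetical model of a 128-bit XMM register holding two doubles:
   q[0] is the low quadword, q[1] is the high quadword. */
typedef struct { double q[2]; } xmm_pd;

/* Illustrative model of SHUFPD dest, src, imm8 (not an Intel API).
   imm8 bit 0 picks which DEST value lands in the low quadword;
   imm8 bit 1 picks which SRC value lands in the high quadword. */
xmm_pd shufpd_model(xmm_pd dest, xmm_pd src, unsigned imm8)
{
    xmm_pd r;
    r.q[0] = dest.q[imm8 & 1u];
    r.q[1] = src.q[(imm8 >> 1) & 1u];
    return r;
}

/* Swap the two values by passing the same register as both operands
   with imm8 = 1: low <- old high (bit 0 = 1), high <- old low (bit 1 = 0). */
xmm_pd swap_pd(xmm_pd x)
{
    return shufpd_model(x, x, 1u);
}
```

In compiled code the same operation is normally reached through the _mm_shuffle_pd intrinsic from emmintrin.h rather than a scalar model like this one.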
The UNPCKHPD (unpack and interleave high packed double-precision floating-point
values) instruction performs an interleaved unpack of the high values from the
source and destination operands and stores the result in the destination operand
(see Figure 11-6).
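
As a rough illustration of the high unpack, the following C sketch models the result layout shown in Figure 11-6: the high value of the destination (X1) ends up in the low quadword and the high value of the source (Y1) in the high quadword. The names xmm_pd and unpckhpd_model are hypothetical and exist only for this example.

```c
/* Hypothetical model of a 128-bit XMM register holding two doubles:
   q[0] is the low quadword, q[1] is the high quadword. */
typedef struct { double q[2]; } xmm_pd;

/* Illustrative model of UNPCKHPD dest, src (not an Intel API).
   Result low quadword  <- high value of DEST (X1).
   Result high quadword <- high value of SRC  (Y1). */
xmm_pd unpckhpd_model(xmm_pd dest, xmm_pd src)
{
    xmm_pd r;
    r.q[0] = dest.q[1];
    r.q[1] = src.q[1];
    return r;
}
```

The corresponding intrinsic in emmintrin.h is _mm_unpackhi_pd.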
The UNPCKLPD (unpack and interleave low packed double-precision floating-point
values) instruction performs an interleaved unpack of the low values from the source
and destination operands and stores the result in the destination operand (see
Figure 11-7).
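
The low unpack mirrors the high unpack, taking the low value of each operand instead. The C sketch below assumes that layout: the low value of the destination (X0) stays in the low quadword and the low value of the source (Y0) is placed in the high quadword. The names xmm_pd and unpcklpd_model are hypothetical.

```c
/* Hypothetical model of a 128-bit XMM register holding two doubles:
   q[0] is the low quadword, q[1] is the high quadword. */
typedef struct { double q[2]; } xmm_pd;

/* Illustrative model of UNPCKLPD dest, src (not an Intel API).
   Result low quadword  <- low value of DEST (X0).
   Result high quadword <- low value of SRC  (Y0). */
xmm_pd unpcklpd_model(xmm_pd dest, xmm_pd src)
{
    xmm_pd r;
    r.q[0] = dest.q[0];
    r.q[1] = src.q[0];
    return r;
}
```

The corresponding intrinsic in emmintrin.h is _mm_unpacklo_pd.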
Figure 11-5. SHUFPD Instruction, Packed Shuffle Operation
DEST (before): X1 | X0
SRC:           Y1 | Y0
DEST (after):  Y1 or Y0 | X1 or X0

Figure 11-6. UNPCKHPD Instruction, High Unpack and Interleave Operation
DEST (before): X1 | X0
SRC:           Y1 | Y0
DEST (after):  Y1 | X1