Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture

PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2)
11.4.1.6 SSE2 Conversion Instructions
SSE2 conversion instructions (see Figure 11-8) support packed and scalar
conversions between:
Double-precision and single-precision floating-point formats
Double-precision floating-point and doubleword integer formats
Single-precision floating-point and doubleword integer formats
Conversion between double-precision and single-precision floating-point
values: The following instructions convert operands between double-precision and
single-precision floating-point formats. The operands being operated on are
contained in XMM registers or memory (at most, one operand can reside in memory;
the destination is always an XMM register).
The CVTPS2PD (convert packed single-precision floating-point values to packed
double-precision floating-point values) instruction converts two packed
single-precision floating-point values to two double-precision floating-point values.
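As an illustration of the packed widening conversion described above, the following is a minimal sketch in C using the SSE2 compiler intrinsics from <emmintrin.h>. The helper name widen_low_two is hypothetical and not part of the manual; the sketch assumes a compiler that maps _mm_cvtps_pd to CVTPS2PD, which takes the two low single-precision elements of the source.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics (also pulls in the SSE header) */

/* Hypothetical helper: widen the two low floats of src to two doubles. */
void widen_low_two(const float src[4], double dst[2])
{
    __m128  ps = _mm_loadu_ps(src);   /* load four packed singles (unaligned) */
    __m128d pd = _mm_cvtps_pd(ps);    /* CVTPS2PD: low two singles -> two doubles */
    _mm_storeu_pd(dst, pd);           /* store the two doubles */
}
```

Every single-precision value is exactly representable in double precision, so this direction never requires rounding.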
The CVTPD2PS (convert packed double-precision floating-point values to packed
single-precision floating-point values) instruction converts two packed
double-precision floating-point values to two single-precision floating-point values.
When a conversion is inexact, the result is rounded according to the rounding mode
selected in the MXCSR register.
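The effect of the MXCSR rounding mode on an inexact narrowing conversion can be sketched as follows. This is an illustrative example only: narrow_two is a hypothetical helper, and it assumes the _MM_GET_ROUNDING_MODE and _MM_SET_ROUNDING_MODE macros provided by common compilers' SSE headers for reading and writing the MXCSR rounding-control field.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics; also provides the _MM_ROUND_* macros */

/* Hypothetical helper: narrow two doubles to floats under the given MXCSR
   rounding mode (one of the _MM_ROUND_* constants), then restore the
   previously selected mode. */
void narrow_two(const double src[2], float dst[4], unsigned int mode)
{
    unsigned int saved = _MM_GET_ROUNDING_MODE();  /* remember current mode */
    _MM_SET_ROUNDING_MODE(mode);                   /* select requested mode */

    __m128d pd = _mm_loadu_pd(src);   /* two packed doubles */
    __m128  ps = _mm_cvtpd_ps(pd);    /* CVTPD2PS: results in the two low lanes,
                                         the two high lanes are zeroed */

    _MM_SET_ROUNDING_MODE(saved);     /* restore caller's rounding mode */
    _mm_storeu_ps(dst, ps);
}
```

Calling narrow_two on a value such as 0.1, which has no exact single-precision representation, with _MM_ROUND_DOWN and then _MM_ROUND_UP should yield two adjacent single-precision values that bracket the input, whereas an exactly representable value such as 0.5 converts identically in every mode.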
The CVTSS2SD (convert scalar single-precision floating-point value to scalar
double-precision floating-point value) instruction converts a single-precision
floating-point value to a double-precision floating-point value.
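A minimal sketch of the scalar widening conversion, again using C intrinsics, might look like the following. The helper name widen_scalar is hypothetical; the sketch assumes _mm_cvtss_sd is compiled to CVTSS2SD, which converts only the low element and takes the upper element of the result from its first argument (zero here).

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: convert one float to double with CVTSS2SD. */
double widen_scalar(float x)
{
    __m128  s = _mm_set_ss(x);                       /* x in the low lane, others zero */
    __m128d d = _mm_cvtss_sd(_mm_setzero_pd(), s);   /* CVTSS2SD: low single -> low double */
    return _mm_cvtsd_f64(d);                         /* extract the low double */
}
```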
The CVTSD2SS (convert scalar double-precision floating-point value to scalar
single-precision floating-point value) instruction converts a double-precision
floating-point value to a single-precision floating-point value. When the conversion
is inexact, the result is rounded according to the rounding mode selected in the
MXCSR register.
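One way to observe whether a scalar narrowing conversion was inexact is to convert the result back and compare it with the input. The sketch below does this with C intrinsics; narrow_scalar and narrows_exactly are hypothetical helper names, and the check assumes a finite, non-NaN input.

```c
#include <assert.h>
#include <emmintrin.h>  /* SSE2 intrinsics */

/* Hypothetical helper: convert one double to float with CVTSD2SS,
   rounding according to the current MXCSR rounding mode. */
float narrow_scalar(double x)
{
    __m128d d = _mm_set_sd(x);                       /* x in the low lane, upper zero */
    __m128  s = _mm_cvtsd_ss(_mm_setzero_ps(), d);   /* CVTSD2SS: low double -> low single */
    return _mm_cvtss_f32(s);                         /* extract the low single */
}

/* Hypothetical helper: returns 1 if x survives the narrowing unchanged,
   0 if the conversion had to round. Widening back is always exact. */
int narrows_exactly(double x)
{
    return (double)narrow_scalar(x) == x;
}
```

Under this sketch, a value such as 2.5 narrows exactly, while 0.1 does not, because 0.1 has no exact single-precision representation.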
Figure 11-7. UNPCKLPD Instruction, Low Unpack and Interleave Operation
[Figure: DEST holds X1, X0 and SRC holds Y1, Y0; after the operation, DEST holds Y0, X0.]