Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Basic Architecture
PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)
The CVTSI2SS (convert doubleword integer to scalar single-precision floating-point
value) instruction converts a signed doubleword integer into a single-precision
floating-point value. When the conversion is inexact, the result is rounded according
to the rounding mode selected in the MXCSR register.
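As an illustration of an inexact conversion, the following is a minimal portable C sketch, not the instruction itself. The function names are hypothetical. It assumes the compiler's integer-to-float conversion rounds to nearest (even), which corresponds to the default MXCSR rounding mode; the C standard leaves this direction implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the value CVTSI2SS produces. ASSUMPTION: the
   compiler's int-to-float conversion rounds to nearest (even), as the
   default MXCSR rounding mode does. */
float cvtsi2ss_model(int32_t value) {
    return (float)value;
}

/* Hypothetical helper: returns 1 when the conversion is exact, that is,
   when the float converts back to the same integer. The comparison is
   done in 64 bits so that 2147483648.0f stays in range. */
int conversion_is_exact(int32_t value) {
    return (int64_t)cvtsi2ss_model(value) == (int64_t)value;
}

/* A float has a 24-bit significand, so 2^24 is exact but 2^24 + 1 is not. */
void cvtsi2ss_demo(void) {
    assert(conversion_is_exact(16777216));
    assert(!conversion_is_exact(16777217));
    /* 2^24 + 1 lies halfway between two floats; the tie goes to the even one. */
    assert(cvtsi2ss_model(16777217) == 16777216.0f);
}
```

On x86 compilers, the corresponding intrinsic is _mm_cvtsi32_ss in <xmmintrin.h>; the sketch avoids it only to stay portable.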
The CVTPS2PI (convert packed single-precision floating-point values to packed
doubleword integers) instruction converts two packed single-precision floating-point
values into two packed signed doubleword integers. When the conversion is inexact,
the result is rounded according to the rounding mode selected in the MXCSR register.
The CVTTPS2PI (convert with truncation packed single-precision floating-point
values to packed doubleword integers) instruction is similar to the CVTPS2PI
instruction, except that each source value is truncated (rounded toward zero) to an
integer value (see Section 4.8.4.2, “Truncation with SSE and SSE2 Conversion
Instructions”).
The CVTSS2SI (convert scalar single-precision floating-point value to doubleword
integer) instruction converts a single-precision floating-point value into a signed
doubleword integer. When the conversion is inexact, the result is rounded according
to the rounding mode selected in the MXCSR register. The CVTTSS2SI (convert with
truncation scalar single-precision floating-point value to doubleword integer)
instruction is similar to the CVTSS2SI instruction, except that the source value is
truncated (rounded toward zero) to an integer value (see Section 4.8.4.2, “Truncation
with SSE and SSE2 Conversion Instructions”).
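The difference between the rounding and truncating conversions can be sketched in portable C. This is a scalar model under stated assumptions, not the instructions themselves: the function names are hypothetical, the rounding model assumes the MXCSR rounding mode is round-to-nearest (even), and inputs must be in the signed doubleword range. The packed forms (CVTPS2PI, CVTTPS2PI) apply the same conversion to each of the two elements.

```c
#include <assert.h>
#include <stdint.h>

/* ASSUMPTION: only in-range, non-NaN inputs are modeled. What the
   hardware does with other inputs is not covered by this sketch. */
static void check_in_range(float x) {
    assert(x >= -2147483648.0f && x < 2147483648.0f);
}

/* Hypothetical model of CVTTSS2SI: a C cast discards the fraction,
   which is rounding toward zero. */
int32_t cvttss2si_model(float x) {
    check_in_range(x);
    return (int32_t)x;
}

/* Hypothetical model of CVTSS2SI, ASSUMING round-to-nearest (even). */
int32_t cvtss2si_model(float x) {
    check_in_range(x);
    int32_t t = (int32_t)x;      /* integer part, toward zero */
    float frac = x - (float)t;   /* signed fraction; exact for these inputs */
    if (frac > 0.5f || (frac == 0.5f && (t & 1)))
        return t + 1;            /* round up; ties go to the even integer */
    if (frac < -0.5f || (frac == -0.5f && (t & 1)))
        return t - 1;            /* round down; ties go to the even integer */
    return t;
}
```

For example, cvtss2si_model(2.5f) yields 2 and cvtss2si_model(3.5f) yields 4 (ties to even), while cvttss2si_model(-2.7f) yields -2. On x86 compilers, the corresponding intrinsics are _mm_cvtss_si32 and _mm_cvttss_si32 in <xmmintrin.h>.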
10.4.4 SSE 64-Bit SIMD Integer Instructions
SSE extensions add the following 64-bit packed integer instructions to the IA-32
architecture. These instructions operate on data in MMX registers and 64-bit memory
locations.
NOTE
When SSE2 extensions are present in an IA-32 processor, these
instructions are extended to operate on 128-bit operands in XMM
registers and 128-bit memory locations.
The PAVGB (compute average of packed unsigned byte integers) and PAVGW
(compute average of packed unsigned word integers) instructions compute a SIMD
average of two packed unsigned byte or word integer operands, respectively. For
each corresponding pair of data elements in the packed source operands, the
elements are added together, a 1 is added to the temporary sum, and that result is
shifted right one bit position.
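The add, add-one, shift sequence described above can be sketched per element in portable C. This is a scalar model with hypothetical function names, not the instructions themselves. The key point is that the temporary sum is held in a type wider than the element, so the carry out of the top bit is not lost.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounded average of two unsigned bytes: (a + b + 1) >> 1.
   The temporary sum can reach 511, so it needs 9 bits. */
uint8_t avg_u8(uint8_t a, uint8_t b) {
    unsigned sum = (unsigned)a + (unsigned)b + 1u;
    return (uint8_t)(sum >> 1);
}

/* Rounded average of two unsigned words; the temporary sum needs 17 bits. */
uint16_t avg_u16(uint16_t a, uint16_t b) {
    uint32_t sum = (uint32_t)a + (uint32_t)b + 1u;
    return (uint16_t)(sum >> 1);
}

/* Hypothetical model of PAVGB on a 64-bit operand: eight byte elements. */
void pavgb_model(const uint8_t a[8], const uint8_t b[8], uint8_t out[8]) {
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < 8; ++i)
        out[i] = avg_u8(a[i], b[i]);
}

/* Hypothetical model of PAVGW on a 64-bit operand: four word elements. */
void pavgw_model(const uint16_t a[4], const uint16_t b[4], uint16_t out[4]) {
    assert(a != NULL && b != NULL && out != NULL);
    for (size_t i = 0; i < 4; ++i)
        out[i] = avg_u16(a[i], b[i]);
}
```

Adding 1 before the shift makes the average round up on a half: avg_u8(1, 2) is 2, not 1, and avg_u8(255, 255) is 255 because the 9-bit intermediate does not wrap.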
The PEXTRW (extract word) instruction copies a selected word from an MMX register
into a general-purpose register.
The PINSRW (insert word) instruction copies a word from a general-purpose register
or from memory into a selected word location in an MMX register.
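The word extract and insert operations can be sketched in portable C by treating the MMX register as a 64-bit integer holding four words, with word 0 in the low-order bits. This is a scalar model with hypothetical function names. Two details are assumptions that go beyond the text above: the word selector is reduced to its two low-order bits, and the extracted word is returned zero-extended. In the instructions the selector is an immediate operand; here it is an ordinary argument.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of PEXTRW on an MMX register.
   ASSUMPTION: only the two low-order bits of the selector are used,
   and the word is zero-extended into the 32-bit result. */
uint32_t pextrw_model(uint64_t mm, unsigned index) {
    unsigned shift = 16u * (index & 3u);
    return (uint32_t)((mm >> shift) & 0xFFFFu);
}

/* Hypothetical model of PINSRW on an MMX register: replaces the
   selected word and leaves the other three words unchanged. */
uint64_t pinsrw_model(uint64_t mm, uint16_t word, unsigned index) {
    unsigned shift = 16u * (index & 3u);
    uint64_t mask = (uint64_t)0xFFFFu << shift;
    return (mm & ~mask) | ((uint64_t)word << shift);
}

/* Words, from low to high: 0x1111, 0x2222, 0x3333, 0x4444. */
void word_access_demo(void) {
    uint64_t mm = 0x4444333322221111ULL;
    assert(pextrw_model(mm, 2) == 0x3333u);
    assert(pinsrw_model(mm, 0xBEEF, 1) == 0x44443333BEEF1111ULL);
}
```

On x86 compilers, the corresponding intrinsics for the MMX forms are _mm_extract_pi16 and _mm_insert_pi16 in <xmmintrin.h>.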