Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Basic Architecture

PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3
OperandB (128 bits, four data elements): 3b, 2b, 1b, 0b
Result (stored in OperandA): 2b, 2b, 0b, 0b
The MOVDDUP instruction loads/moves 64 bits, duplicating the 64 bits from the
source into both halves of the 128-bit destination.
MOVDDUP OperandA, OperandB
OperandA (128 bits, two data elements): 1a, 0a
OperandB (64 bits, one data element): 0b
Result (stored in OperandA): 0b, 0b
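The duplication can be illustrated with a small scalar model. This is only a sketch of the data movement, not an implementation of the instruction: the function name movddup and the list representation (index 0 holds element 0, the low 64 bits) are assumptions made for illustration.

```python
def movddup(src_element0):
    """Model the MOVDDUP data movement: one 64-bit source element
    is copied into both elements of the 128-bit result."""
    # Index 0 is element 0 (low quadword); index 1 is element 1 (high quadword).
    return [src_element0, src_element0]


operand_a = [1.5, 2.5]          # previous contents of OperandA, overwritten below
operand_b = 7.0                 # the single 64-bit source element (0b)
operand_a = movddup(operand_b)
print(operand_a)                # [7.0, 7.0]
```

Note that the lists are written low element first, whereas the operand listings above are written high element first.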
12.3.4 SIMD Floating-Point Instructions Provide Packed Addition/Subtraction
The ADDSUBPS instruction has two 128-bit operands. The instruction performs
single-precision addition on the second and fourth pairs of 32-bit data elements
within the operands, and single-precision subtraction on the first and third pairs.
ADDSUBPS OperandA, OperandB
OperandA (128 bits, four data elements): 3a, 2a, 1a, 0a
OperandB (128 bits, four data elements): 3b, 2b, 1b, 0b
Result (stored in OperandA): 3a+3b, 2a-2b, 1a+1b, 0a-0b
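The alternating pattern can be checked with a scalar model. The sketch below assumes a list representation in which index 0 holds element 0, so the lists read low element first; the name addsubps is used only for illustration. Python floats are double precision, so the model shows which elements are added and which are subtracted, not single-precision rounding.

```python
def addsubps(a, b):
    """Model the ADDSUBPS lane pattern on two four-element operands.

    Even-numbered elements (0 and 2) are subtracted, odd-numbered
    elements (1 and 3) are added.
    """
    if len(a) != 4 or len(b) != 4:
        raise ValueError("both operands must hold four data elements")
    return [a[i] + b[i] if i % 2 else a[i] - b[i] for i in range(4)]


operand_a = [1.0, 2.0, 3.0, 4.0]       # elements 0a, 1a, 2a, 3a
operand_b = [10.0, 20.0, 30.0, 40.0]   # elements 0b, 1b, 2b, 3b
print(addsubps(operand_a, operand_b))  # [-9.0, 22.0, -27.0, 44.0]
```

Read from high element to low, the printed result corresponds to 3a+3b, 2a-2b, 1a+1b, 0a-0b.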
The ADDSUBPD instruction has two 128-bit operands. The instruction performs
double-precision addition on the second pair of quadwords, and double-precision
subtraction on the first pair.
ADDSUBPD OperandA, OperandB
OperandA (128 bits, two data elements): 1a, 0a
OperandB (128 bits, two data elements): 1b, 0b
Result (stored in OperandA): 1a+1b, 0a-0b
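The two-element case can be modeled in the same way. This is a sketch under the same assumptions: the name addsubpd and the low-element-first list layout are illustrative and are not part of the instruction definition.

```python
def addsubpd(a, b):
    """Model the ADDSUBPD lane pattern on two two-element operands:
    element 0 is subtracted, element 1 is added."""
    if len(a) != 2 or len(b) != 2:
        raise ValueError("both operands must hold two data elements")
    return [a[0] - b[0], a[1] + b[1]]


operand_a = [1.0, 2.0]    # elements 0a, 1a
operand_b = [0.5, 0.25]   # elements 0b, 1b
print(addsubpd(operand_a, operand_b))  # [0.5, 2.25]
```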
12.3.5 SIMD Floating-Point Instructions Provide Horizontal Addition/Subtraction
Most SIMD instructions operate vertically: the result in position i is a function
of the elements in position i of both operands. Horizontal addition/subtraction
instead operates horizontally: contiguous data elements in the same source
operand are used to produce a result.
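The distinction between the two access patterns can be sketched as follows. The helper names vertical_add and horizontal_pair_sums are hypothetical and model only the general idea; they are not definitions of any particular instruction.

```python
def vertical_add(a, b):
    """Vertical: result element i depends on element i of both operands."""
    return [x + y for x, y in zip(a, b)]


def horizontal_pair_sums(a):
    """Horizontal: each result is the sum of two contiguous elements
    taken from the same operand."""
    return [a[i] + a[i + 1] for i in range(0, len(a), 2)]


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
print(vertical_add(a, b))        # [11.0, 22.0, 33.0, 44.0]
print(horizontal_pair_sums(a))   # [3.0, 7.0]
```

In the vertical case each output needs one element from each operand; in the horizontal case a single operand supplies both inputs of each sum, which is why four elements yield only two results per operand.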
The HADDPS instruction performs a single-precision addition on contiguous data
elements. The first data element of the result is obtained by adding the first and
second elements of the first operand; the second element by adding the third and
fourth elements of the first operand; the third by adding the first and second