Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture

PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3
elements of the second operand; and the fourth by adding the third and fourth
elements of the second operand.
HADDPS OperandA, OperandB
OperandA (128 bits, four data elements): a3, a2, a1, a0
OperandB (128 bits, four data elements): b3, b2, b1, b0
Result (Stored in OperandA): b3+b2, b1+b0, a3+a2, a1+a0
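The HADDPS element pairing can be checked against a scalar model. The following is a minimal sketch in plain C, not the architectural definition: the type `vec4f` and the function `haddps_model` are hypothetical names introduced here, and array index 0 is assumed to hold the lowest data element (a0 or b0).

```c
/* Hypothetical scalar model of HADDPS (names are illustrative only).
   e[0] is the lowest data element, e[3] the highest. */
typedef struct { float e[4]; } vec4f;

static vec4f haddps_model(vec4f a, vec4f b)
{
    vec4f r;
    r.e[0] = a.e[0] + a.e[1];  /* first result element:  a0 + a1 */
    r.e[1] = a.e[2] + a.e[3];  /* second result element: a2 + a3 */
    r.e[2] = b.e[0] + b.e[1];  /* third result element:  b0 + b1 */
    r.e[3] = b.e[2] + b.e[3];  /* fourth result element: b2 + b3 */
    return r;
}
```

Under this model, applying the operation twice with both operands set to the same vector leaves the sum of all four elements in every element of the result.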
The HSUBPS instruction performs a single-precision subtraction on contiguous data
elements. The first data element of the result is obtained by subtracting the second
element of the first operand from the first element of the first operand; the second
element by subtracting the fourth element of the first operand from the third element
of the first operand; the third by subtracting the second element of the second
operand from the first element of the second operand; and the fourth by subtracting
the fourth element of the second operand from the third element of the second
operand.
HSUBPS OperandA, OperandB
OperandA (128 bits, four data elements): a3, a2, a1, a0
OperandB (128 bits, four data elements): b3, b2, b1, b0
Result (Stored in OperandA): b2-b3, b0-b1, a2-a3, a0-a1
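Because subtraction is not commutative, the operand order within each pair matters: the even-numbered element is the minuend and the odd-numbered element is the subtrahend. The loop below is a sketch of that pattern in plain C, assuming index 0 is the lowest data element; `hsubps_model` is a hypothetical name, not an instruction or library function.

```c
/* Hypothetical scalar model of HSUBPS (illustrative names only).
   Index 0 is the lowest data element of each 128-bit operand. */
static void hsubps_model(float result[4], const float a[4], const float b[4])
{
    float tmp[4];  /* temporaries, so result may alias a or b */
    for (int i = 0; i < 2; i++) {
        tmp[i]     = a[2 * i] - a[2 * i + 1];  /* a0 - a1, then a2 - a3 */
        tmp[i + 2] = b[2 * i] - b[2 * i + 1];  /* b0 - b1, then b2 - b3 */
    }
    for (int i = 0; i < 4; i++)
        result[i] = tmp[i];
}
```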
The HADDPD instruction performs a double-precision addition on contiguous data
elements. The first data element of the result is obtained by adding the first and
second elements of the first operand; the second element by adding the first and
second elements of the second operand.
HADDPD OperandA, OperandB
OperandA (128 bits, two data elements): a1, a0
OperandB (128 bits, two data elements): b1, b0
Result (Stored in OperandA): b1+b0, a1+a0
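Since the result is stored in OperandA, a scalar model of HADDPD can be written in the same destructive form. This is a sketch under stated assumptions: `haddpd_model` is a hypothetical name, `dst` stands in for OperandA, `src` for OperandB, and index 0 is the lower data element. Both sums are computed before `dst` is written so that the model still behaves sensibly when `dst` and `src` refer to the same storage.

```c
/* Hypothetical scalar model of HADDPD; dst plays the role of OperandA
   and is overwritten with the result. Index 0 is the lower element. */
static void haddpd_model(double dst[2], const double src[2])
{
    double lo = dst[0] + dst[1];  /* first result element:  a0 + a1 */
    double hi = src[0] + src[1];  /* second result element: b0 + b1 */
    dst[0] = lo;
    dst[1] = hi;
}
```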
The HSUBPD instruction performs a double-precision subtraction on contiguous data
elements. The first data element of the result is obtained by subtracting the second
element of the first operand from the first element of the first operand; the second
element by subtracting the second element of the second operand from the first
element of the second operand.
HSUBPD OperandA, OperandB
OperandA (128 bits, two data elements): a1, a0
OperandB (128 bits, two data elements): b1, b0
Result (Stored in OperandA): b0-b1, a0-a1
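The HSUBPD pairing can be sketched the same way in plain C. The type `vec2d` and the function `hsubpd_model` are hypothetical names used only for illustration, and index 0 is assumed to be the lower data element.

```c
/* Hypothetical scalar model of HSUBPD (illustrative names only).
   e[0] is the lower data element, e[1] the upper. */
typedef struct { double e[2]; } vec2d;

static vec2d hsubpd_model(vec2d a, vec2d b)
{
    vec2d r;
    r.e[0] = a.e[0] - a.e[1];  /* first result element:  a0 - a1 */
    r.e[1] = b.e[0] - b.e[1];  /* second result element: b0 - b1 */
    return r;
}
```

For example, with a1 = 2.0, a0 = 9.0, b1 = 7.0, and b0 = 4.0, the model places a0 - a1 = 7.0 in the lower element and b0 - b1 = -3.0 in the upper element.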