Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture
Vol. 1 12-11
PROGRAMMING WITH SSE3 AND SUPPLEMENTAL SSE3
There are six horizontal add instructions (represented by three mnemonics); three
operate on 128-bit operands and three operate on 64-bit operands. The width of each
data element is either 16 bits or 32 bits. The mnemonics are listed below.
• PHADDW adds each pair of adjacent signed 16-bit integers in the source and
destination operands and packs the signed 16-bit results into the destination
operand.
• PHADDSW adds each pair of adjacent signed 16-bit integers in the source and
destination operands and packs the signed, saturated 16-bit results into the
destination operand.
• PHADDD adds each pair of adjacent signed 32-bit integers in the source and
destination operands and packs the signed 32-bit results into the destination
operand.
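The difference between the wraparound result of PHADDW and the saturated result of
PHADDSW can be illustrated with a scalar model. The C sketch below is not Intel's
definition of the instructions. The names phadd_words_model, wrap16 and sat16 are
hypothetical, and the code assumes the 128-bit form, in which the low four result
words come from adjacent pairs in the destination operand and the high four from
adjacent pairs in the source operand.

```c
#include <stdint.h>

/* Keep the low 16 bits of v and reinterpret them as a signed word. */
static int16_t wrap16(int32_t v)
{
    uint16_t u = (uint16_t)v;
    return (int16_t)(u >= 0x8000u ? (int32_t)u - 0x10000 : (int32_t)u);
}

/* Clamp v to the signed 16-bit range. */
static int16_t sat16(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/*
 * Scalar model (an illustration only) of the 128-bit horizontal word add.
 * saturate == 0 models PHADDW, saturate != 0 models PHADDSW.
 * dst is both an input and the output, as with the destination operand.
 */
static void phadd_words_model(int16_t dst[8], const int16_t src[8], int saturate)
{
    int16_t r[8];
    for (int i = 0; i < 4; i++) {
        /* Sum each adjacent pair in a wider type so nothing overflows. */
        int32_t d = (int32_t)dst[2 * i] + dst[2 * i + 1];
        int32_t s = (int32_t)src[2 * i] + src[2 * i + 1];
        r[i]     = saturate ? sat16(d) : wrap16(d);  /* low half: destination pairs */
        r[i + 4] = saturate ? sat16(s) : wrap16(s);  /* high half: source pairs */
    }
    for (int i = 0; i < 8; i++)
        dst[i] = r[i];
}
```

In this model the pair (32767, 1) produces -32768 without saturation and 32767 with
saturation. All pairs whose sum fits in 16 bits give the same result in both modes.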
There are six horizontal subtract instructions (represented by three mnemonics);
three operate on 128-bit operands and three operate on 64-bit operands. The width
of each data element is either 16 bits or 32 bits. These are listed below.
• PHSUBW performs horizontal subtraction on each adjacent pair of 16-bit signed
integers by subtracting the most significant word from the least significant word
of each pair in the source and destination operands. The signed 16-bit results are
packed and written to the destination operand.
• PHSUBSW performs horizontal subtraction on each adjacent pair of 16-bit signed
integers by subtracting the most significant word from the least significant word
of each pair in the source and destination operands. The signed, saturated 16-bit
results are packed and written to the destination operand.
• PHSUBD performs horizontal subtraction on each adjacent pair of 32-bit signed
integers by subtracting the most significant doubleword from the least significant
doubleword of each pair in the source and destination operands. The signed
32-bit results are packed and written to the destination operand.
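The operand order of the subtraction (least significant element minus most
significant element of each pair) is easy to get backwards, so a scalar model may
help. The C sketch below is an illustration under stated assumptions, not Intel's
definition. The names phsubd_model and sub_wrap32 are hypothetical, the parameter
n selects the 128-bit form (n = 4) or the 64-bit form (n = 2), and the code assumes
that the low half of the result comes from the destination operand and the high
half from the source operand.

```c
#include <stdint.h>

/* Compute a - b with 32-bit wraparound, avoiding signed-overflow behavior. */
static int32_t sub_wrap32(int32_t a, int32_t b)
{
    uint32_t u = (uint32_t)a - (uint32_t)b;
    return (u >= 0x80000000u) ? (int32_t)((int64_t)u - 4294967296LL)
                              : (int32_t)u;
}

/*
 * Scalar model (an illustration only) of PHSUBD.
 * n is the number of doublewords per operand: 4 for 128-bit, 2 for 64-bit.
 * dst is both an input and the output, as with the destination operand.
 */
static void phsubd_model(int32_t *dst, const int32_t *src, int n)
{
    int32_t r[4];
    int half = n / 2;
    for (int i = 0; i < half; i++) {
        /* Element 2*i is the least significant of the pair, 2*i+1 the most. */
        r[i]        = sub_wrap32(dst[2 * i], dst[2 * i + 1]);
        r[i + half] = sub_wrap32(src[2 * i], src[2 * i + 1]);
    }
    for (int i = 0; i < n; i++)
        dst[i] = r[i];
}
```

For example, with a destination of {10, 3, 100, 1} and a source of {5, 7, 0, 0},
this model leaves {7, 99, -2, 0} in the destination.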
12.6.2 Packed Absolute Values
There are six packed-absolute-value instructions (represented by three mnemonics).
Three operate on 128-bit operands and three operate on 64-bit operands. The widths
of data elements are 8 bits, 16 bits or 32 bits. The absolute value of each data
element of the source operand is stored as an UNSIGNED result in the destination
operand.
• PABSB computes the absolute value of each signed byte data element.
• PABSW computes the absolute value of each signed 16-bit data element.
• PABSD computes the absolute value of each signed 32-bit data element.
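The reason the result is described as unsigned can be seen in a scalar model of
PABSB. The C sketch below is an illustration only, not Intel's definition. The name
pabsb_model and the parameter n (8 bytes for the 64-bit form, 16 for the 128-bit
form) are hypothetical.

```c
#include <stdint.h>

/*
 * Scalar model (an illustration only) of PABSB.
 * Each signed source byte is replaced by its magnitude, stored as an
 * unsigned byte in dst. n is the number of bytes per operand.
 */
static void pabsb_model(uint8_t *dst, const int8_t *src, int n)
{
    for (int i = 0; i < n; i++) {
        int v = src[i];                      /* widen so that -(-128) is representable */
        dst[i] = (uint8_t)(v < 0 ? -v : v);  /* magnitude in the range 0..128 */
    }
}
```

The edge case is the most negative input: -128 maps to 128 (bit pattern 80H), which
is correct only when the result byte is read as unsigned. Read as a signed byte, the
same bit pattern would still be -128.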