Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture
INSTRUCTION SET SUMMARY
5.8 SUPPLEMENTAL STREAMING SIMD EXTENSIONS 3 (SSSE3) INSTRUCTIONS
SSSE3 provides 32 instructions (represented by 14 mnemonics) to accelerate
computations on packed integers. These include:
• Twelve instructions that perform horizontal addition or subtraction operations.
• Six instructions that evaluate absolute values.
• Two instructions that perform multiply and add operations and speed up the
evaluation of dot products.
• Two instructions that accelerate packed-integer multiply operations and produce
integer values with scaling.
• Two instructions that perform a byte-wise, in-place shuffle according to the
second shuffle control operand.
• Six instructions that negate packed integers in the destination operand if the
corresponding element in the source operand is less than zero.
• Two instructions that align data from the composite of two operands.
SSSE3 instructions can only be executed on Intel 64 and IA-32 processors that
support SSSE3 extensions. Support for these instructions can be detected with the
CPUID instruction. See the description of the CPUID instruction in Chapter 3,
“Instruction Set Reference, A-M,” of the Intel® 64 and IA-32 Architectures Software
Developer’s Manual, Volume 2A.
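As a rough illustration of the detection step, the sketch below tests the SSSE3 feature flag, which CPUID reports in bit 9 of ECX when executed with EAX = 01H. It assumes the ECX value has already been obtained by some platform-specific means (inline assembly, a compiler intrinsic, or an operating-system interface). The function name `has_ssse3` and the sample ECX values are hypothetical and serve only to exercise the bit test.

```python
SSSE3_BIT = 9  # CPUID.(EAX=01H):ECX bit 9 reports SSSE3 support


def has_ssse3(ecx: int) -> bool:
    """Return True if a CPUID leaf-1 ECX value has the SSSE3 flag set."""
    return bool((ecx >> SSSE3_BIT) & 1)


# Hypothetical ECX values, not read from real hardware.
print(has_ssse3(0x0000_0200))  # only bit 9 (SSSE3) set
print(has_ssse3(0x0000_0001))  # only bit 0 (SSE3) set
```

Software would typically perform this check once at startup and select an SSSE3 or a fallback code path accordingly.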
The sections that follow describe each subgroup.
5.8.1 Horizontal Addition/Subtraction
PHADDW Adds two adjacent, signed 16-bit integers horizontally from the
source and destination operands and packs the signed 16-bit
results to the destination operand.
PHADDSW Adds two adjacent, signed 16-bit integers horizontally from the
source and destination operands and packs the signed, saturated
16-bit results to the destination operand.
PHADDD Adds two adjacent, signed 32-bit integers horizontally from the
source and destination operands and packs the signed 32-bit
results to the destination operand.
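The three horizontal-add descriptions above can be sketched as a scalar reference model. This is an illustrative Python model, not how the hardware is implemented. It assumes each operand is given as a list of signed elements with index 0 as the least significant element, and that sums of pairs from the destination fill the lower half of the result while sums of pairs from the source fill the upper half. The names `horizontal_add`, `to_signed`, and `saturate` are hypothetical.

```python
def to_signed(value: int, bits: int) -> int:
    """Wrap an integer into the signed two's-complement range of `bits` bits."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def saturate(value: int, bits: int) -> int:
    """Clamp an integer to the signed range of `bits` bits."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def horizontal_add(dest, src, bits=16, saturating=False):
    """Model PHADDW (bits=16), PHADDD (bits=32), or PHADDSW (saturating=True).

    Adjacent pairs are summed; destination pairs come first in the result,
    followed by source pairs.
    """
    fix = saturate if saturating else to_signed
    return [fix(v[i] + v[i + 1], bits)
            for v in (dest, src)
            for i in range(0, len(v), 2)]


# PHADDW on eight 16-bit elements per operand (128-bit form).
d = [1, 2, 3, 4, 5, 6, 7, 8]
s = [10, 20, 30, 40, 50, 60, 70, 80]
print(horizontal_add(d, s))  # [3, 7, 11, 15, 30, 70, 110, 150]

# 32767 + 1 overflows: PHADDW wraps around, PHADDSW saturates.
big = [32767, 1, 0, 0, 0, 0, 0, 0]
print(horizontal_add(big, big)[0])                   # -32768
print(horizontal_add(big, big, saturating=True)[0])  # 32767

# PHADDD on four 32-bit elements per operand.
print(horizontal_add([1, 2, 3, 4], [5, 6, 7, 8], bits=32))  # [3, 7, 11, 15]
```

The only difference between the wrapping and saturating forms in this model is the final fix-up step applied to each pair sum.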
PHSUBW Performs horizontal subtraction on each adjacent pair of 16-bit
signed integers by subtracting the most significant word from
the least significant word of each pair in the source and destination
operands. The signed 16-bit results are packed and written
to the destination operand.
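The operand order in PHSUBW is easy to get backwards, so a small scalar model may help. The sketch below assumes the 128-bit form with eight signed 16-bit words per operand, index 0 being the least significant word, and results from destination pairs placed before results from source pairs. The function name `phsubw` and the input values are hypothetical.

```python
def phsubw(dest, src):
    """Scalar model of PHSUBW: for each adjacent pair, low word minus high word."""
    def wrap16(value: int) -> int:
        # Keep 16 bits and reinterpret them as a signed two's-complement word.
        value &= 0xFFFF
        return value - 0x10000 if value & 0x8000 else value

    return [wrap16(v[i] - v[i + 1])
            for v in (dest, src)
            for i in range(0, 8, 2)]


d = [10, 3, 5, 5, 0, 1, -32768, 1]
s = [8, 7, 6, 5, 4, 3, 2, 1]
print(phsubw(d, s))  # [7, 0, -1, 32767, 1, 1, 1, 1]
```

In the fourth pair, -32768 minus 1 does not fit in 16 bits, so the non-saturating result wraps around to 32767.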
PHSUBSW Performs horizontal subtraction on each adjacent pair of 16-bit
signed integers by subtracting the most significant word from
the least significant word of each pair in the source and destina-