Intel 64 and IA-32 Architectures Software Developer's Manual Volume 1, Basic Architecture
Vol. 1 10-17
PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)
The PMAXUB (maximum of packed unsigned byte integers) instruction compares the
corresponding unsigned byte integers in two packed operands and returns the
greater of each comparison to the destination operand.
The PMINUB (minimum of packed unsigned byte integers) instruction compares the
corresponding unsigned byte integers in two packed operands and returns the lesser
of each comparison to the destination operand.
The PMAXSW (maximum of packed signed word integers) instruction compares the
corresponding signed word integers in two packed operands and returns the greater
of each comparison to the destination operand.
The PMINSW (minimum of packed signed word integers) instruction compares the
corresponding signed word integers in two packed operands and returns the lesser of
each comparison to the destination operand.
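The four minimum/maximum instructions above differ only in element width and signedness. The following is a minimal scalar sketch of their behavior on 64-bit (MMX-sized) operands. The function names (pmaxub_model and so on) are hypothetical and chosen for illustration only; the code models the described semantics in portable C and does not execute the instructions themselves.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar models: dst is both the first source and the
   destination, mirroring the two-operand instruction form. */

/* PMAXUB: element-wise maximum of 8 unsigned bytes. */
void pmaxub_model(uint8_t dst[8], const uint8_t src[8])
{
    for (size_t i = 0; i < 8; i++)
        if (src[i] > dst[i]) dst[i] = src[i];
}

/* PMINUB: element-wise minimum of 8 unsigned bytes. */
void pminub_model(uint8_t dst[8], const uint8_t src[8])
{
    for (size_t i = 0; i < 8; i++)
        if (src[i] < dst[i]) dst[i] = src[i];
}

/* PMAXSW: element-wise maximum of 4 signed words. */
void pmaxsw_model(int16_t dst[4], const int16_t src[4])
{
    for (size_t i = 0; i < 4; i++)
        if (src[i] > dst[i]) dst[i] = src[i];
}

/* PMINSW: element-wise minimum of 4 signed words. */
void pminsw_model(int16_t dst[4], const int16_t src[4])
{
    for (size_t i = 0; i < 4; i++)
        if (src[i] < dst[i]) dst[i] = src[i];
}
```

The signed and unsigned forms can give different results for the same bit pattern: the byte 0x80 is 128 under PMAXUB/PMINUB, whereas the word 0x8000 is -32768 under PMAXSW/PMINSW.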
The PMOVMSKB (move byte mask) instruction creates an 8-bit mask from the packed
byte integers in an MMX register and stores the result in the low byte of a
general-purpose register. The mask contains the most significant bit of each byte
in the MMX register. (When operating on 128-bit operands, a 16-bit mask is created.)
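As an illustration of the mask construction, here is a small scalar sketch. The function name pmovmskb_model and its count parameter are hypothetical; the sketch assumes that bit i of the mask is taken from byte i of the source, with count set to 8 for an MMX operand or 16 for a 128-bit operand.

```c
#include <stdint.h>

/* Hypothetical scalar model of PMOVMSKB: collect the most significant
   bit of each source byte into consecutive bits of the result. */
unsigned pmovmskb_model(const uint8_t *bytes, unsigned count)
{
    unsigned mask = 0;
    for (unsigned i = 0; i < count; i++)
        mask |= (unsigned)(bytes[i] >> 7) << i;  /* bit 7 of byte i -> bit i */
    return mask;
}
```

A common use of such a mask is to test the result of a packed byte comparison with ordinary integer instructions, since each compared byte contributes exactly one bit.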
The PMULHUW (multiply packed unsigned word integers and store high result)
instruction performs a SIMD unsigned multiply of the words in the two source
operands and returns the high word of each result to an MMX register.
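The "high word" here is the upper 16 bits of each 32-bit unsigned product. A minimal scalar sketch, using the hypothetical name pmulhuw_model and assuming the destination doubles as the first source operand:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of PMULHUW on four unsigned words. */
void pmulhuw_model(uint16_t dst[4], const uint16_t src[4])
{
    for (size_t i = 0; i < 4; i++) {
        /* Widen before multiplying so the full 32-bit product is kept. */
        uint32_t product = (uint32_t)dst[i] * (uint32_t)src[i];
        dst[i] = (uint16_t)(product >> 16);  /* keep only the high word */
    }
}
```

For example, 0xFFFF times 0xFFFF is 0xFFFE0001, so the stored word is 0xFFFE.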
The PSADBW (compute sum of absolute differences) instruction computes the SIMD
absolute differences of the corresponding unsigned byte integers in two source
operands, sums the differences, and stores the sum in the low word of the
destination operand.
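The computation can be sketched in scalar form as follows. The name psadbw_model is hypothetical, and the sketch covers the 64-bit (eight-byte) case only, returning the value that would be written to the low word of the destination.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of PSADBW on two 8-byte operands. */
uint16_t psadbw_model(const uint8_t a[8], const uint8_t b[8])
{
    unsigned sum = 0;
    for (size_t i = 0; i < 8; i++) {
        /* Absolute difference of unsigned bytes without underflow. */
        unsigned diff = (a[i] > b[i]) ? (unsigned)(a[i] - b[i])
                                      : (unsigned)(b[i] - a[i]);
        sum += diff;
    }
    /* At most 8 * 255 = 2040, so the sum always fits in 16 bits. */
    return (uint16_t)sum;
}
```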
The PSHUFW (shuffle packed word integers) instruction shuffles the words in the
source operand according to the order specified by an 8-bit immediate operand and
returns the result to the destination operand.
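To make the role of the immediate concrete, the sketch below assumes the usual encoding in which each 2-bit field of the 8-bit immediate selects one source word: bits 1:0 choose the word placed in destination word 0, bits 3:2 choose destination word 1, and so on. The name pshufw_model is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scalar model of PSHUFW on four words. */
void pshufw_model(uint16_t dst[4], const uint16_t src[4], uint8_t imm8)
{
    uint16_t tmp[4];  /* temporary so that dst and src may be the same array */
    for (size_t i = 0; i < 4; i++)
        tmp[i] = src[(imm8 >> (2 * i)) & 3];  /* 2-bit selector for word i */
    for (size_t i = 0; i < 4; i++)
        dst[i] = tmp[i];
}
```

Under this encoding, an immediate of 0x1B (binary 00 01 10 11) reverses the order of the four words, and an immediate of 0x00 broadcasts source word 0 to every destination word.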
10.4.5 MXCSR State Management Instructions
The MXCSR state management instructions (LDMXCSR and STMXCSR) load and save
the state of the MXCSR register, respectively. The LDMXCSR instruction loads the
MXCSR register from memory, while the STMXCSR instruction stores the contents of
the register to memory.
10.4.6 Cacheability Control, Prefetch, and Memory Ordering Instructions
SSE extensions introduce several new instructions to give programs more control
over the caching of data. They also introduce the PREFETCHh instructions, which
provide the ability to prefetch data to a specified cache level, and the SFENCE
instruction, which enforces program ordering on stores. These instructions are
described in the following sections.