Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture

PROGRAMMING WITH STREAMING SIMD EXTENSIONS (SSE)
tively, the low single-precision floating-point values of two operands and store the
result in the low doubleword of the destination operand.
The MULPS (multiply packed single-precision floating-point values) instruction
multiplies two packed single-precision floating-point operands.
The MULSS (multiply scalar single-precision floating-point values) instruction
multiplies the low single-precision floating-point values of two operands and stores
the result in the low doubleword of the destination operand.
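
The difference between the packed and scalar forms can be sketched in C with the
SSE intrinsics from <xmmintrin.h>, which compilers typically map to MULPS and MULSS.
This is a minimal illustration, assuming an x86 target with SSE enabled; the helper
names mul_packed and mul_scalar are hypothetical and not part of the manual.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: multiply all four lanes (packed form, MULPS). */
void mul_packed(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_mul_ps(va, vb));
}

/* Hypothetical helper: multiply lane 0 only (scalar form, MULSS).
   Lanes 1..3 of the result are copied from the first operand. */
void mul_scalar(const float a[4], const float b[4], float out[4])
{
    __m128 va = _mm_loadu_ps(a);
    __m128 vb = _mm_loadu_ps(b);
    _mm_storeu_ps(out, _mm_mul_ss(va, vb));
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, mul_packed yields {5, 12, 21, 32}, while
mul_scalar yields {5, 2, 3, 4}: only the low element is a product.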
The DIVPS (divide packed single-precision floating-point values) instruction divides
two packed single-precision floating-point operands.
The DIVSS (divide scalar single-precision floating-point values) instruction divides
the low single-precision floating-point values of two operands and stores the result in
the low doubleword of the destination operand.
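
One common way to combine the two divide forms is a loop that handles four elements
per iteration with the packed form and finishes any remainder with the scalar form.
The sketch below assumes an x86 target with SSE; div_array is a hypothetical helper
name, and the exact instructions emitted depend on the compiler.

```c
#include <stddef.h>
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: out[i] = num[i] / den[i] for n floats. */
void div_array(const float *num, const float *den, float *out, size_t n)
{
    size_t i = 0;

    /* Main loop: four quotients per iteration (packed form, DIVPS). */
    for (; i + 4 <= n; i += 4) {
        __m128 q = _mm_div_ps(_mm_loadu_ps(num + i), _mm_loadu_ps(den + i));
        _mm_storeu_ps(out + i, q);
    }

    /* Tail: one quotient at a time in the low lane (scalar form, DIVSS). */
    for (; i < n; ++i) {
        __m128 q = _mm_div_ss(_mm_load_ss(num + i), _mm_load_ss(den + i));
        _mm_store_ss(out + i, q);
    }
}
```

The unaligned load and store intrinsics are used so that the sketch places no
alignment requirement on the arrays.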
The RCPPS (compute reciprocals of packed single-precision floating-point values)
instruction computes the approximate reciprocals of values in a packed
single-precision floating-point operand.
The RCPSS (compute reciprocal of scalar single-precision floating-point values)
instruction computes the approximate reciprocal of the low single-precision
floating-point value in the source operand and stores the result in the low doubleword
of the destination operand.
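
Because the reciprocal instructions return an approximation, code that needs more
accuracy often follows them with a Newton-Raphson step, r1 = r0 * (2 - x * r0). The
sketch below shows both the raw approximation and one refinement step. It assumes an
x86 target with SSE; rcp_approx and rcp_refined are hypothetical helper names, and
the refinement step is a general numerical technique, not something this section
prescribes.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: approximate 1/x in all four lanes (RCPPS). */
void rcp_approx(const float x[4], float out[4])
{
    _mm_storeu_ps(out, _mm_rcp_ps(_mm_loadu_ps(x)));
}

/* Hypothetical helper: RCPPS followed by one Newton-Raphson step,
   r1 = r0 * (2 - x * r0), which roughly squares the relative error. */
void rcp_refined(const float x[4], float out[4])
{
    __m128 vx  = _mm_loadu_ps(x);
    __m128 r0  = _mm_rcp_ps(vx);
    __m128 two = _mm_set1_ps(2.0f);
    __m128 r1  = _mm_mul_ps(r0, _mm_sub_ps(two, _mm_mul_ps(vx, r0)));
    _mm_storeu_ps(out, r1);
}
```

Where a correctly rounded quotient is required, the divide instructions remain the
appropriate choice; the approximate form trades accuracy for speed.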
The SQRTPS (compute square roots of packed single-precision floating-point values)
instruction computes the square roots of the values in a packed single-precision
floating-point operand.
The SQRTSS (compute square root of scalar single-precision floating-point values)
instruction computes the square root of the low single-precision floating-point value
in the source operand and stores the result in the low doubleword of the destination
operand.
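
As an illustration of the packed and scalar square-root forms, the sketch below
computes the lengths of 2-D vectors stored as separate x and y arrays. It assumes an
x86 target with SSE; vec2_lengths4 and vec2_length1 are hypothetical helper names,
and compilers typically map the intrinsics to SQRTPS and SQRTSS.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: lengths of four 2-D vectors at once (SQRTPS).
   len[i] = sqrt(xs[i]*xs[i] + ys[i]*ys[i]) */
void vec2_lengths4(const float xs[4], const float ys[4], float len[4])
{
    __m128 vx = _mm_loadu_ps(xs);
    __m128 vy = _mm_loadu_ps(ys);
    __m128 sum = _mm_add_ps(_mm_mul_ps(vx, vx), _mm_mul_ps(vy, vy));
    _mm_storeu_ps(len, _mm_sqrt_ps(sum));
}

/* Hypothetical helper: length of a single 2-D vector, computed in the
   low lane only (SQRTSS). */
float vec2_length1(float x, float y)
{
    __m128 sum = _mm_set_ss(x * x + y * y);
    return _mm_cvtss_f32(_mm_sqrt_ss(sum));
}
```

For the vectors (3, 4), (6, 8), (5, 12), and (0, 0), vec2_lengths4 produces 5, 10,
13, and 0.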
The RSQRTPS (compute reciprocals of square roots of packed single-precision
floating-point values) instruction computes the approximate reciprocals of the
square roots of the values in a packed single-precision floating-point operand.
The RSQRTSS (compute reciprocal of square root of scalar single-precision
floating-point values) instruction computes the approximate reciprocal of the square
root of the low single-precision floating-point value in the source operand and stores
the result in the low doubleword of the destination operand.
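
The approximate reciprocal square root is often used where a fast 1/sqrt(x) is
enough, for example when normalizing vectors, optionally followed by one
Newton-Raphson step, y1 = y0 * (1.5 - 0.5 * x * y0 * y0). The sketch below assumes an
x86 target with SSE and positive, finite inputs; rsqrt_approx4 and rsqrt_refined are
hypothetical helper names, and the refinement is a general numerical technique
rather than something this section prescribes.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: approximate 1/sqrt(x) in all four lanes (RSQRTPS). */
void rsqrt_approx4(const float x[4], float out[4])
{
    _mm_storeu_ps(out, _mm_rsqrt_ps(_mm_loadu_ps(x)));
}

/* Hypothetical helper: approximate 1/sqrt(x) in the low lane (RSQRTSS),
   then one Newton-Raphson step in ordinary float arithmetic.
   Assumes x is positive and finite. */
float rsqrt_refined(float x)
{
    float y0 = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));
    return y0 * (1.5f - 0.5f * x * y0 * y0);
}
```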
The MAXPS (return maximum of packed single-precision floating-point values)
instruction compares the corresponding values from two packed single-precision
floating-point operands and returns the numerically greater value from each
comparison to the destination operand.
The MAXSS (return maximum of scalar single-precision floating-point values)
instruction compares the low values from two packed single-precision floating-point
operands and returns the numerically greater value from the comparison to the low
doubleword of the destination operand.
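
A typical use of the packed maximum is clamping values from below, for example
replacing negative elements with zero. The sketch below shows that alongside the
scalar form, which compares only the low elements. It assumes an x86 target with SSE
and ordinary (non-NaN) inputs; clamp_below_zero4 and max_low are hypothetical helper
names.

```c
#include <xmmintrin.h>  /* SSE intrinsics; assumes an x86 target with SSE */

/* Hypothetical helper: out[i] = max(x[i], 0) in all four lanes (MAXPS). */
void clamp_below_zero4(const float x[4], float out[4])
{
    __m128 zero = _mm_setzero_ps();
    _mm_storeu_ps(out, _mm_max_ps(_mm_loadu_ps(x), zero));
}

/* Hypothetical helper: lane 0 is max(a[0], b[0]) (MAXSS);
   lanes 1..3 of the result are copied from the first operand a. */
void max_low(const float a[4], const float b[4], float out[4])
{
    _mm_storeu_ps(out, _mm_max_ss(_mm_loadu_ps(a), _mm_loadu_ps(b)));
}
```

With x = {-1, 2, -3, 4}, clamp_below_zero4 yields {0, 2, 0, 4}. With a = {1, 2, 3, 4}
and b = {9, 9, 9, 9}, max_low yields {9, 2, 3, 4}.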