Intel 64 and IA-32 Architectures Software Developer’s Manual, Volume 1: Basic Architecture

PROGRAMMING WITH STREAMING SIMD EXTENSIONS 2 (SSE2)
• An application that expects to detect x87 FPU exceptions that occur during the
  execution of x87 FPU instructions will not be notified if exceptions occur
  during the execution of the corresponding SSE/SSE2/SSE3 instructions (see
  footnote 1), unless the exceptions that are enabled (unmasked) in the x87 FPU
  control word have also been enabled in the MXCSR register and the application
  is capable of handling SIMD floating-point exceptions (#XF).
• Masked exceptions that occur during an SSE/SSE2/SSE3 library call cannot
  be detected by unmasking the exceptions after the call (in an attempt to
  generate the fault based on the fact that an exception flag is set). A SIMD
  floating-point exception flag that is already set when the corresponding
  exception is unmasked will not generate a fault; only the next occurrence of
  that unmasked exception will generate a fault.
• An application that checks the x87 FPU status word to determine whether any
  masked exception flags were set during an x87 FPU library call will also need
  to check the MXCSR register to detect a masked exception flag that was set
  during an SSE/SSE2/SSE3 library call.
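The last two points can be illustrated with a minimal C sketch. It assumes a compiler that ships the SSE intrinsics header <xmmintrin.h> (GCC, Clang, and MSVC do) and an x86 target with SSE enabled. The function names sse_reciprocal, simd_flags_raised_by_call, and unmask_with_stale_flag are hypothetical and chosen only for illustration; sse_reciprocal stands in for an arbitrary SSE library routine.

```c
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr, _MM_EXCEPT_*, _MM_MASK_* */

/* Hypothetical stand-in for an SSE library routine: computes 1.0f / x with
   a scalar SSE divide. The volatile copies keep the compiler from folding
   the divide or moving it across the MXCSR reads and writes. */
static float sse_reciprocal(float x)
{
    volatile float in = x;
    volatile float out =
        _mm_cvtss_f32(_mm_div_ss(_mm_set_ss(1.0f), _mm_set_ss(in)));
    return out;
}

/* Returns the MXCSR exception flags (bits 0-5) that sse_reciprocal(x) set.
   All SIMD exceptions are masked for the duration of the call, so the
   flags are recorded without any #XF being generated. */
unsigned int simd_flags_raised_by_call(float x)
{
    unsigned int saved = _mm_getcsr();

    /* Clear the sticky flags and mask every SIMD exception. */
    _mm_setcsr((saved & ~(unsigned int)_MM_EXCEPT_MASK) | _MM_MASK_MASK);
    (void)sse_reciprocal(x);

    /* The flags must be read from MXCSR; the x87 status word is unaffected. */
    unsigned int raised = _mm_getcsr() & _MM_EXCEPT_MASK;

    _mm_setcsr(saved | raised);  /* restore caller masks, keep flags sticky */
    return raised;
}

/* Sets the divide-by-zero flag while the exception is masked, then unmasks
   it. Returns 1 if the flag is still set afterwards. As described in the
   text, the stale flag alone does not generate a fault. */
int unmask_with_stale_flag(void)
{
    unsigned int saved = _mm_getcsr();

    _mm_setcsr((saved & ~(unsigned int)_MM_EXCEPT_MASK) | _MM_MASK_MASK);
    (void)sse_reciprocal(0.0f);                 /* sets ZE while masked */

    /* Unmask divide-by-zero with ZE already set: no fault is expected. */
    _mm_setcsr(_mm_getcsr() & ~(unsigned int)_MM_MASK_DIV_ZERO);
    int still_set = (_mm_getcsr() & _MM_EXCEPT_DIV_ZERO) != 0;

    _mm_setcsr(saved);                          /* restore caller state */
    return still_set;
}
```

Under these assumptions, simd_flags_raised_by_call(0.0f) reports _MM_EXCEPT_DIV_ZERO, and a caller that wants a fault for that condition has to unmask the exception before the call rather than after it.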
11.6 WRITING APPLICATIONS WITH SSE/SSE2 EXTENSIONS
The following sections give guidelines for writing application programs and
operating-system code that use the SSE and SSE2 extensions. Because the SSE and
SSE2 extensions share the same state and perform companion operations, these
guidelines apply to both sets of extensions.
Chapter 12 in the Intel® 64 and IA-32 Architectures Software Developer’s Manual,
Volume 3A, discusses the interface to the processor for context switching as well as
other operating system considerations when writing code that uses SSE/SSE2/SSE3
extensions.
11.6.1 General Guidelines for Using SSE/SSE2 Extensions
The following guidelines describe how to take full advantage of the performance
gains available with the SSE and SSE2 extensions:
• Ensure that the processor supports the SSE and SSE2 extensions.
• Ensure that your operating system supports the SSE and SSE2 extensions.
  (Operating system support for the SSE extensions implies support for the SSE2
  extensions and vice versa.)
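The processor check in the first guideline can be sketched in C as follows. This is a minimal illustration, not the manual's own procedure: it assumes a GCC or Clang toolchain that provides <cpuid.h> and its __get_cpuid helper, and it relies on the CPUID feature flags SSE (CPUID.01H:EDX bit 25) and SSE2 (CPUID.01H:EDX bit 26). The function name cpu_has_sse_and_sse2 and the two macro names are hypothetical. Operating-system support is not tested here, because the relevant control-register bits cannot be read from application code.

```c
#include <cpuid.h>  /* __get_cpuid: GCC/Clang helper (toolchain assumption) */

/* Feature flags returned in EDX by CPUID leaf 01H. */
#define CPUID_EDX_SSE   (1u << 25)
#define CPUID_EDX_SSE2  (1u << 26)

/* Returns 1 if the processor reports both SSE and SSE2 support, else 0.
   This covers only the processor check, not the operating-system check. */
int cpu_has_sse_and_sse2(void)
{
    unsigned int eax, ebx, ecx, edx;

    /* __get_cpuid returns 0 when the requested leaf is not available. */
    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return 0;

    return (edx & CPUID_EDX_SSE) && (edx & CPUID_EDX_SSE2);
}
```

An application would typically call such a check once at startup and fall back to a non-SSE code path when it returns 0.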
1. SSE3 refers to ADDSUBPD, ADDSUBPS, HADDPD, HADDPS, HSUBPD and HSUBPS; the only other
SSE3 instruction that can raise floating-point exceptions is FISTTP: it can generate x87 FPU
invalid operation and inexact result exceptions.