Intel 64 and IA-32 Architectures Software Developer's Manual, Volume 1: Basic Architecture

GUIDELINES FOR WRITING SIMD FLOATING-POINT EXCEPTION HANDLERS
operands into up to four sets of sub-operands, and will submit them one set at a time
to an emulation function (see Example E-1 in Section E.4.3, “Example SIMD
Floating-Point Emulation Implementation”). The emulation function will examine the
sub-operands, and will possibly redo the necessary calculation.
Two cases are possible:
• If an unmasked (enabled) exception would occur in this process, the emulation
  function will return to its caller (the filter function) with the appropriate
  information. The filter will invoke a (previously registered) user floating-point
  exception handler for this set of sub-operands, and will record the result upon
  return from the user handler (provided the user handler allows continuation of
  the execution).
• If no unmasked (enabled) exception would occur, the emulation function will
  determine and will return to its caller the result of the operation for the current
  set of sub-operands (it has to be IEEE Standard 754 compliant). The filter
  function will record the result (plus any new flag settings).
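The two cases above can be illustrated with a minimal sketch of an emulation function for a single set of sub-operands. It is not the code of Example E-1: the names emu_result and emulate_div are hypothetical, only a single-precision divide is modeled, only the invalid-operation and divide-by-zero conditions are detected, and signaling NaNs are not distinguished from quiet NaNs. The sketch relies only on the MXCSR layout, in which the exception flags occupy bits 0-5 and the corresponding mask bits occupy bits 7-12.

```c
#include <assert.h>
#include <math.h>
#include <stdint.h>

/* MXCSR exception flags occupy bits 0-5; the matching mask bits are bits 7-12. */
#define FLAG_IE 0x01u   /* invalid operation */
#define FLAG_ZE 0x04u   /* divide-by-zero */

/* Hypothetical record handed back to the filter function. */
typedef struct {
    int      unmasked;  /* nonzero: the filter must call the user handler */
    uint32_t flags;     /* exception flags this sub-operation raises */
    float    result;    /* meaningful only when unmasked == 0 */
} emu_result;

/* Emulate one single-precision divide for one set of sub-operands.
   Simplifications (assumptions of this sketch): only invalid and
   divide-by-zero are detected, and NaN operands are propagated quietly. */
static emu_result emulate_div(float a, float b, uint32_t mxcsr)
{
    emu_result r = { 0, 0u, 0.0f };
    uint32_t masks;

    assert((mxcsr >> 16) == 0u);          /* MXCSR bits 16-31 are reserved */
    masks = (mxcsr >> 7) & 0x3Fu;         /* align mask bits with flag bits */

    if (isnan(a) || isnan(b)) {           /* NaN operand: propagate it */
        r.result = isnan(a) ? a : b;
        return r;
    }
    if ((a == 0.0f && b == 0.0f) || (isinf(a) && isinf(b)))
        r.flags = FLAG_IE;                /* 0/0 or infinity/infinity */
    else if (b == 0.0f && !isinf(a))
        r.flags = FLAG_ZE;                /* finite nonzero value / 0 */

    if (r.flags & ~masks) {               /* first case: unmasked exception */
        r.unmasked = 1;                   /* caller invokes the user handler */
        return r;
    }
    if (r.flags & FLAG_IE)                /* second case: masked responses */
        r.result = NAN;                   /* a quiet NaN stands in for the indefinite */
    else if (r.flags & FLAG_ZE)
        r.result = (!signbit(a) != !signbit(b)) ? -INFINITY : INFINITY;
    else
        r.result = a / b;                 /* no exception detected by this sketch */
    return r;
}
```

With the power-on MXCSR value 1F80H (all exceptions masked), emulate_div(1.0f, 0.0f, 0x1F80u) reports the ZE flag and a result of positive infinity; clearing the ZM bit (bit 9) makes the same call return with unmasked set, leaving the result to the user handler.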
The user-level filter function will then call the emulation function for the next set of
sub-operands (if any). When done with all the operand sets, the partial results will be
packed (if the excepting instruction has a packed floating-point result, which is true
for most SSE/SSE2/SSE3 numeric instructions) and the filter will return to the low-
level exception handler, which in turn will return from the interruption, allowing
execution to continue. Note that the instruction pointer (EIP) has to be altered to
point to the instruction following the excepting instruction, in order to continue
execution correctly.
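The loop described above (submit one set of sub-operands at a time, record each partial result, pack, then advance the instruction pointer) might be sketched as follows. Everything here is an assumption made for illustration: the saved_ctx layout, the emu_fn and user_fn callback signatures, and the toy demo_div and demo_handler functions are hypothetical and do not correspond to any operating system's context structure or to Example E-1.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LANES 4   /* a packed single-precision operand holds four values */

/* Hypothetical view of the context saved by the low-level handler. */
typedef struct {
    float     dst[LANES];   /* first source, overwritten by the result */
    float     src[LANES];   /* second source */
    uint32_t  mxcsr;        /* saved MXCSR: flags in bits 0-5, masks in bits 7-12 */
    uintptr_t ip;           /* saved instruction pointer (EIP) */
} saved_ctx;

typedef struct { int unmasked; uint32_t flags; float result; } emu_out;
typedef emu_out (*emu_fn)(float a, float b, uint32_t mxcsr);
/* User handler: stores a result and returns nonzero to allow continuation. */
typedef int (*user_fn)(float a, float b, uint32_t flags, float *result);

/* Returns 1 if execution may resume, 0 if the user handler refused. */
static int filter_packed(saved_ctx *ctx, size_t insn_len,
                         emu_fn emulate, user_fn user_handler)
{
    float    partial[LANES];
    uint32_t new_flags = 0u;
    size_t   i;

    assert(ctx != NULL && emulate != NULL && user_handler != NULL);
    for (i = 0; i < LANES; i++) {       /* one set of sub-operands at a time */
        emu_out r = emulate(ctx->dst[i], ctx->src[i], ctx->mxcsr);
        if (r.unmasked &&
            !user_handler(ctx->dst[i], ctx->src[i], r.flags, &r.result))
            return 0;                   /* no continuation: context left untouched */
        partial[i] = r.result;          /* record the partial result */
        new_flags |= r.flags;           /* plus any new flag settings */
    }
    for (i = 0; i < LANES; i++)         /* pack the partial results */
        ctx->dst[i] = partial[i];
    ctx->mxcsr |= new_flags & 0x3Fu;
    ctx->ip += insn_len;                /* resume after the excepting instruction */
    return 1;
}

/* Toy emulator for demonstration only: treats any zero divisor as a
   divide-by-zero and assumes the dividend is finite and nonzero. */
static emu_out demo_div(float a, float b, uint32_t mxcsr)
{
    emu_out r = { 0, 0u, 0.0f };
    if (b == 0.0f) {
        r.flags = 0x04u;                      /* ZE flag */
        r.unmasked = !((mxcsr >> 9) & 1u);    /* ZM is MXCSR bit 9 */
    }
    if (!r.unmasked)
        r.result = a / b;                     /* infinity when b is zero and ZE is masked */
    return r;
}

/* Toy user handler: substitutes zero and allows continuation. */
static int demo_handler(float a, float b, uint32_t flags, float *result)
{
    (void)a; (void)b; (void)flags;
    *result = 0.0f;
    return 1;
}
```

Deferring the write to ctx until every set has been processed means that a refusal by the user handler leaves the saved context exactly as it was found, which is one reasonable design choice rather than a requirement stated in the text.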
If a user-mode floating-point exception filter is not provided, then all the work for
decoding the excepting instruction, reading its operands, emulating the instruction
for the components of the result that do not correspond to unmasked floating-point
exceptions, and providing the compounded result will have to be performed by the
user-provided floating-point exception handler.
Actual emulation might have to take place for one operand or pair of operands for
scalar operations, and for all sub-operands or pairs of sub-operands for packed
operations. The steps to perform are the following:
1. The excepting instruction has to be decoded and the operands have to be read
   from the saved context.
2. The instruction has to be emulated for each (pair of) sub-operand(s); if no
   floating-point exception occurs, the partial result has to be saved; if a masked
   floating-point exception occurs, the masked result has to be produced through
   emulation and saved, and the appropriate status flags have to be set; if an
   unmasked floating-point exception occurs, the result has to be generated by the
   user-provided floating-point exception handler, and the appropriate status flags
   have to be set.
3. The partial results have to be combined and written to the context that will be
   restored upon application program resumption.
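As a rough illustration of the first and last of these steps, the sketch below classifies an instruction by its mandatory prefix to find out how many sets of sub-operands need emulation, and then writes the partial results back. The names fp_shape, decode_shape and combine_single are hypothetical. The decoder only recognizes the form [66H | F2H | F3H] 0FH xx used by arithmetic instructions such as ADDPS (0F 58), ADDSS (F3 0F 58), ADDPD (66 0F 58) and ADDSD (F2 0F 58); REX and other prefixes, ModR/M decoding, and instructions that do not follow this pattern are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Shape of the sub-operands implied by the mandatory prefix. */
typedef struct {
    int    lanes;       /* sets of sub-operands to emulate: 1, 2 or 4 */
    int    is_double;   /* 0: single precision, 1: double precision */
    size_t opcode_off;  /* offset of the 0FH escape byte */
} fp_shape;

/* Decoding (partial): classify an instruction of the form [66|F2|F3] 0F xx.
   Returns 0 on success, -1 if the bytes do not match that form. */
static int decode_shape(const uint8_t *code, size_t len, fp_shape *out)
{
    size_t off = 0;

    assert(code != NULL && out != NULL);
    out->lanes = 4;                 /* no prefix: packed single precision */
    out->is_double = 0;
    out->opcode_off = 0;
    if (len > 0) {
        switch (code[0]) {
        case 0xF3: out->lanes = 1; off = 1; break;                      /* scalar single */
        case 0xF2: out->lanes = 1; out->is_double = 1; off = 1; break;  /* scalar double */
        case 0x66: out->lanes = 2; out->is_double = 1; off = 1; break;  /* packed double */
        default: break;
        }
    }
    if (len < off + 2 || code[off] != 0x0F)
        return -1;                  /* not a two-byte-opcode instruction */
    out->opcode_off = off;
    return 0;
}

/* Combining: write `lanes` single-precision partial results into the saved
   image of the destination register. For a scalar instruction only the
   lowest element is replaced and the upper elements stay unchanged. */
static void combine_single(float dst[4], const float *partial, int lanes)
{
    assert(dst != NULL && partial != NULL && lanes >= 1 && lanes <= 4);
    memcpy(dst, partial, (size_t)lanes * sizeof(float));
}
```

The lanes value returned by the decoder is what bounds the per-sub-operand emulation loop of the middle step: one iteration for a scalar instruction, two or four for a packed one.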