Intel 64 and IA-32 Architectures Software Developers Manual Volume 1, Basic Architecture

E-22 Vol. 1

GUIDELINES FOR WRITING SIMD FLOATING-POINT EXCEPTION HANDLERS

The arithmetic operations exemplified are emulated as follows:

1. If the denormals-are-zeros mode is enabled (the DAZ bit in MXCSR is set to 1),
replace all the denormal inputs with zeros of the same sign (the denormal flag
is not affected by this change).

2. Perform the operation using x87 FPU instructions, with exceptions disabled, the
original user rounding mode, and single precision. This reveals invalid, denormal,
or divide-by-zero exceptions (if there are any) and stores the result in memory as
a double precision value (whose exponent range is large enough to appear
“unbounded” for the result of the single precision computation).

3. If no unmasked exceptions were detected, determine if the result is less than the
smallest normal number (tiny) that can be represented in single precision
format, or greater than the largest normal number that can be represented in
single precision format (huge). If an unmasked overflow or underflow occurs,
calculate the scaled result that will be handed to the user exception handler, as
specified by IEEE Standard 754.

4. If no exception was raised, calculate the result with a “bounded” exponent. If the
result is tiny, it requires denormalization (shifting the significand right while
incrementing the exponent to bring it into the admissible range of [-126,+127]
for single precision floating-point numbers).

The result obtained in step 2 cannot be used because it might incur a double
rounding error (it was rounded to 24 bits in step 2, and might have to be rounded
again in the denormalization process). To overcome this, calculate the result as
a double precision value, and store it to memory in single precision format.
Rounding first to 53 bits in the significand, and then to 24, never causes a double
rounding error (exact properties exist that state when a double rounding error
occurs, but for the elementary arithmetic operations, the rule of thumb is that if
an infinitely precise result is rounded to 2p+1 bits and then again to p bits, the
result is the same as when rounding directly to p bits, which means that no
double rounding error occurs).

5. If the result is inexact and the inexact exceptions are unmasked, the calculated
result will be delivered to the user floating-point exception handler.

6. The flush-to-zero case is dealt with if the result is tiny.

7. The emulation function returns RAISE_EXCEPTION to the filter function if an
exception has to be raised (the exception_cause field indicates the cause).
Otherwise, the emulation function returns DO_NOT_RAISE_EXCEPTION. In the
first case, the result is provided by the user exception handler called by the filter
function. In the second case, it is provided by the emulation function. The filter
function must collect all the partial results and assemble the scalar or packed
result that is used if execution is to continue.
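
The input replacement of step 1 and the rounding concern of step 4 can be illustrated with a minimal C sketch. It is not the emulation code of this appendix: all function names below are hypothetical, round_to_24_bits_unbounded is only a software stand-in for the value that the x87 computation of step 2 would store, and the sketch assumes IEEE 754 single and double precision formats, the default round-to-nearest mode, and a compiler that does not relax floating-point semantics.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Step 1 (DAZ set): replace a denormal input with a zero of the same sign.
   Hypothetical helper; normal numbers, zeros, infinities and NaNs pass through. */
float daz_input(float x)
{
    uint32_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    /* Denormal: exponent field all zeros, fraction field nonzero. */
    if ((bits & 0x7F800000u) == 0 && (bits & 0x007FFFFFu) != 0) {
        bits &= 0x80000000u;               /* keep only the sign bit */
        memcpy(&x, &bits, sizeof x);
    }
    return x;
}

/* Assumed stand-in for the step 2 result: round a finite, normal double to a
   24-bit significand (round to nearest even), keeping the double exponent range. */
double round_to_24_bits_unbounded(double x)
{
    uint64_t bits;
    assert(sizeof bits == sizeof x);
    memcpy(&bits, &x, sizeof bits);
    uint64_t lsb = (bits >> 29) & 1u;      /* lowest fraction bit that survives */
    bits += UINT64_C(0x0FFFFFFF) + lsb;    /* round half to even */
    bits &= ~UINT64_C(0x1FFFFFFF);         /* drop the low 29 fraction bits */
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Step 4: form the product in double precision, then store it as single. */
float mul_single_rounding(float a, float b)
{
    double wide = (double)a * (double)b;   /* exact: 24 + 24 <= 53 bits */
    return (float)wide;                    /* the only rounding; denormalizes if tiny */
}

/* What step 4 warns against: round to 24 bits first, then denormalize. */
float mul_double_rounding(float a, float b)
{
    double wide = (double)a * (double)b;
    return (float)round_to_24_bits_unbounded(wide);
}
```

For the inputs a = 0x1.004002p-100f and b = 0x1.fffffep-41f, the exact product lies slightly above the midpoint between the neighboring single precision denormals 0x1p-140 and 0x1.008p-140. mul_single_rounding returns 0x1.008p-140f, while mul_double_rounding first rounds the product onto that midpoint and then, on the tie, returns 0x1p-140f. This is the double rounding error that storing the double precision result directly in single precision format avoids.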