User's Manual

AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
Minimize Floating-Point-to-Integer Conversions
C++, C, and Fortran define floating-point-to-integer conversions
as truncating. This creates a problem because the active
rounding mode in an application is typically round-to-nearest-even,
so a plain FISTP would round rather than truncate. The
classical way to do a double-to-int conversion therefore
temporarily switches the FPU to truncating mode, as follows:
Example 1 (Fast):
FLD QWORD PTR [X] ;load double to be converted
FSTCW [SAVE_CW] ;save current FPU control word
MOVZX EAX, WORD PTR [SAVE_CW] ;retrieve control word
OR EAX, 0C00h ;rounding control field = truncate
MOV WORD PTR [NEW_CW], AX ;new FPU control word
FLDCW [NEW_CW] ;load new FPU control word
FISTP DWORD PTR [I] ;do double->int conversion
FLDCW [SAVE_CW] ;restore original control word
The AMD Athlon processor contains special acceleration
hardware to execute such code as quickly as possible. In most
situations, the above code is therefore the fastest way to
perform floating-point-to-integer conversion and the conversion
is compliant both with programming language standards and
the IEEE-754 standard.
According to the recommendations for inlining (see "Always
Inline Functions with Fewer than 25 Machine Instructions" on
page 72), the above code should not be put into a separate
subroutine (e.g., ftol). Instead, it should be inlined into the
main code.
In some code, a floating-point number is converted to an
integer and the result is immediately converted back to
floating-point. In such cases, use the FRNDINT instruction
instead of FISTP in the code above for maximum performance.
FRNDINT delivers the integral result directly to an FPU
register in floating-point form, which is faster than first using
FISTP to store the integer result and then converting it back to
floating-point with FILD.
If there are multiple, consecutive floating-point-to-integer
conversions, the cost of FLDCW operations should be
minimized by saving the current FPU control word, forcing the