User's Manual

Accelerating Floating-Point Divides and Square Roots
AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
necessary for the currently selected precision. This means that
setting precision control to single precision (versus Win32
default of double precision) lowers the latency of those
operations.
The Microsoft® Visual C environment provides functions to
manipulate the FPU control word and thus the precision
control. Note that these functions are not very fast, so changes
of precision control should be inserted where they create little
overhead, such as outside a computation-intensive loop.
Otherwise the overhead created by the function calls outweighs
the benefit from reducing the latencies of divide and square
root operations.
The following example shows how to set the precision control to
single precision and later restore the original settings in the
Microsoft Visual C environment.
Example:
/* prototype for _controlfp() function */
#include <float.h>
unsigned int orig_cw;
/* Get current FPU control word and save it */
orig_cw = _controlfp (0,0);
/* Set precision control in FPU control word to single
precision. This reduces the latency of divide and square
root operations.
*/
_controlfp (_PC_24, _MCW_PC);
/* restore original FPU control word */
_controlfp (orig_cw, 0xfffff);
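The advice above about keeping control-word changes outside a computation-intensive loop can be sketched as a complete function. This is a minimal illustration, not code from the manual: the function name divide_all and the macro HAVE_X87_PC are hypothetical, and the sketch assumes that precision control is only changed on 32-bit x86 builds with the Microsoft compiler (on other targets the control-word calls are compiled out and the loop runs unchanged). Unlike the fragment above, it restores only the precision-control field rather than the whole control word.

```c
#include <float.h>
#include <stddef.h>

/* Assumption: x87 precision control is only touched on 32-bit x86
   builds with the Microsoft compiler. Elsewhere this is a no-op. */
#if defined(_MSC_VER) && defined(_M_IX86)
#define HAVE_X87_PC 1
#else
#define HAVE_X87_PC 0
#endif

/* Hypothetical helper: divide every element of x[0..n-1] by d.
   The control word is saved and changed once before the loop and
   restored once after it, so the cost of the _controlfp() calls
   is not paid on every iteration. */
void divide_all(float *x, size_t n, float d)
{
    size_t i;
#if HAVE_X87_PC
    /* Get current FPU control word and save it */
    unsigned int orig_cw = _controlfp(0, 0);
    /* Select single precision for the duration of the loop */
    _controlfp(_PC_24, _MCW_PC);
#endif

    for (i = 0; i < n; ++i)
        x[i] = x[i] / d;

#if HAVE_X87_PC
    /* Restore only the precision-control field of the control word */
    _controlfp(orig_cw, _MCW_PC);
#endif
}
```

Because the data is single precision anyway, the results stored in x should not depend on whether the reduced precision setting is active; only the latency of the divides is expected to differ.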