User's Manual

AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
Use FFREEP Macro to Pop One Register from the FPU Stack
In FPU-intensive code, frequently accessed data is often
preloaded at the bottom of the FPU stack before the
floating-point data is processed. When processing is complete, it is
desirable to remove the preloaded data from the FPU stack as
quickly as possible. The classical way to clean up the FPU stack
is to use either of the following instructions:
FSTP ST(0) ;removes one register from stack
FCOMPP ;removes two registers from stack
On the AMD Athlon processor, a faster alternative is the
FFREEP instruction shown below. Note that the FFREEP instruction,
although insufficiently documented in the past, is supported by
all 32-bit x86 processors. The opcode bytes for FFREEP ST(i)
are listed in Table 22 on page 212.
FFREEP ST(0) ;removes one register from stack
FFREEP ST(i) works like FFREE ST(i), except that it also
increments the FPU top-of-stack pointer after freeing the register.
In other words, FFREEP ST(i) marks ST(i) as empty, then
increments the x87 stack pointer. On the AMD Athlon
processor, the FFREEP instruction converts to an internal NOP,
which can issue to any pipe and has no dependencies.
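
The difference between FFREE and FFREEP can be made concrete with a small model. The C sketch below is not how the processor implements the instruction. It only tracks an empty/valid tag for each of the eight registers and the top-of-stack field, assuming the architectural behavior described above. The type X87Model and the model_* function names are hypothetical and chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Illustrative model only: eight empty/valid tags plus the 3-bit TOP field.
   All names here are hypothetical and not taken from the manual. */
typedef struct {
    bool empty[8];  /* one tag per physical register; true = empty */
    unsigned top;   /* physical index of ST(0), always 0..7 */
} X87Model;

/* State after FNINIT: every register tagged empty, TOP = 0. */
static void model_init(X87Model *f) {
    for (int r = 0; r < 8; ++r) f->empty[r] = true;
    f->top = 0;
}

/* Physical register that ST(i) currently names. */
static unsigned model_phys(const X87Model *f, unsigned i) {
    assert(i < 8);
    return (f->top + i) & 7u;
}

/* A load such as FLD: decrement TOP, then mark the new ST(0) valid. */
static void model_push(X87Model *f) {
    f->top = (f->top - 1u) & 7u;
    f->empty[f->top] = false;
}

/* FFREE ST(i): tag ST(i) empty; TOP is left unchanged. */
static void model_ffree(X87Model *f, unsigned i) {
    f->empty[model_phys(f, i)] = true;
}

/* FFREEP ST(i): do the FFREE work, then increment TOP (modulo 8). */
static void model_ffreep(X87Model *f, unsigned i) {
    model_ffree(f, i);
    f->top = (f->top + 1u) & 7u;
}

/* Number of registers currently tagged valid. */
static int model_depth(const X87Model *f) {
    int n = 0;
    for (int r = 0; r < 8; ++r) n += f->empty[r] ? 0 : 1;
    return n;
}
```

In this model, two calls to model_push followed by two calls to model_ffreep(&f, 0) return both the tags and TOP to their initial state. Calling model_ffree(&f, 0) instead empties the register but leaves TOP where it was, so the stack pointer is not restored.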
Many assemblers do not support the FFREEP instruction. In
these cases, a simple text macro that emits the opcode bytes
for FFREEP ST(0) directly can be created:
FFREEP_ST0 TEXTEQU <DB 0DFh, 0C0h>
Floating-Point Compare Instructions
For branches that are dependent on floating-point comparisons,
use the following instructions:
FCOMI
FCOMIP
FUCOMI
FUCOMIP
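
As a hedged illustration of why these instructions suit branches: each of them writes the comparison result directly to ZF, PF, and CF in EFLAGS, so a conditional jump can follow immediately. The C sketch below models only that flag mapping and the conditions tested by JA, JB, JE, and JP. It uses double in place of the x87 extended-precision format, and the names FcomiFlags, model_fcomi, and takes_* are hypothetical.

```c
#include <math.h>
#include <stdbool.h>

/* Illustrative model only: the three EFLAGS bits written by
   FCOMI/FCOMIP/FUCOMI/FUCOMIP. Names are hypothetical. */
typedef struct {
    bool zf, pf, cf;
} FcomiFlags;

/* Compare ST(0) with ST(i) and return the resulting flag values.
   The P forms additionally pop the stack, and the U forms differ in
   how QNaN operands are reported; neither changes the mapping below. */
static FcomiFlags model_fcomi(double st0, double sti) {
    FcomiFlags f = { false, false, false };   /* ST(0) > ST(i) */
    if (isnan(st0) || isnan(sti)) {
        f.zf = f.pf = f.cf = true;            /* unordered */
    } else if (st0 < sti) {
        f.cf = true;                          /* ST(0) < ST(i) */
    } else if (st0 == sti) {
        f.zf = true;                          /* ST(0) = ST(i) */
    }
    return f;
}

/* Conditions tested by the conditional jumps that typically follow. */
static bool takes_ja(FcomiFlags f) { return !f.cf && !f.zf; } /* above  */
static bool takes_jb(FcomiFlags f) { return f.cf; }           /* below  */
static bool takes_je(FcomiFlags f) { return f.zf; }           /* equal  */
static bool takes_jp(FcomiFlags f) { return f.pf; }           /* parity */
```

Because an unordered result sets CF and ZF as well as PF, code that may see NaN operands would test the JP condition before JB or JE.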