User's Manual

10 Group II Optimizations: Secondary Optimizations
AMD Athlon Processor x86 Code Optimization
22007E/0 - November 1999
Avoid Load-Execute Floating-Point Instructions with Integer Operands
Do not use load-execute floating-point instructions with integer
operands (FIADD, FISUB, FISUBR, FIMUL, FIDIV, FIDIVR, FICOM, and
FICOMP). These instructions are VectorPath and generate two OPs
in a cycle, whereas the discrete equivalent (a separate FILD
followed by the floating-point arithmetic instruction) enables a
third DirectPath instruction to be decoded in the same cycle.
Take Advantage of Write Combining
This guideline applies only to operating system, device driver,
and BIOS programmers. To improve system performance, the
AMD Athlon processor aggressively combines multiple memory-write
cycles of any data size that address locations within a 64-byte,
cache-line-aligned write buffer.
See Appendix C, "Implementation of Write Combining" on
page 155 for more details.
Use 3DNow! Instructions
Unless accuracy requirements dictate otherwise, perform
floating-point computations using the 3DNow! instructions
instead of x87 instructions. Because 3DNow! instructions are
SIMD, they achieve twice the number of FLOPs that are achieved
through x87 instructions. 3DNow! instructions also provide a
flat register file instead of the stack-based approach of x87
instructions.
See Table 23 on page 217 for a list of 3DNow! instructions. For
information about instruction usage, see the 3DNow!™
Technology Manual, order# 21928.
Avoid Branches Dependent on Random Data
Avoid data-dependent branches around a single instruction.
Data-dependent branches acting upon essentially random data
can cause the branch prediction logic to mispredict the branch
about 50% of the time. Design branch-free alternative code
sequences instead; they result in shorter average execution time.
See "Avoid Branches Dependent on Random Data" on page 57
for more details.
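
As a minimal sketch of what a branch-free alternative can look
like, the C functions below replace a data-dependent branch with
mask arithmetic. The function names (min_u32_branchless,
abs_i32_branchless, count_above) are hypothetical and chosen for
illustration only; they are not taken from this guide. Whether a
given compiler emits branch-free machine code for them is an
assumption that should be checked in the generated assembly.

```c
#include <stddef.h>
#include <stdint.h>

/* Branchy form: r = (a < b) ? a : b;
 * Branch-free form: build an all-ones mask when a < b, else zero,
 * and use it to select between a and b. Unsigned arithmetic wraps,
 * so (a - b) is well defined even when a < b. */
uint32_t min_u32_branchless(uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(a < b); /* 0xFFFFFFFF or 0 */
    return b + ((a - b) & mask);
}

/* Branchy form: if (x < 0) x = -x;
 * Branch-free form: replicate the sign bit into a mask, then
 * XOR and subtract. The result is returned as unsigned so that
 * INT32_MIN maps to 2147483648 without signed overflow. */
uint32_t abs_i32_branchless(int32_t x)
{
    uint32_t ux = (uint32_t)x;
    uint32_t mask = (uint32_t)0 - (ux >> 31); /* all ones if x is negative */
    return (ux ^ mask) - mask;
}

/* Branchy form: if (data[i] > threshold) count++;
 * Branch-free form: add the 0/1 result of the comparison directly,
 * so there is no branch around the single increment. */
size_t count_above(const int32_t *data, size_t n, int32_t threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; ++i)
        count += (size_t)(data[i] > threshold);
    return count;
}
```

Each function does a small, fixed amount of extra arithmetic in
exchange for removing a branch whose outcome depends on the data.
That trade is most likely to pay off when the data is close to
random; for highly predictable data the branchy form may be just
as fast.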