22007E/0 November 1999 AMD Athlon Processor x86 Code Optimization
4 Instruction Decoding Optimizations
This chapter discusses ways to maximize the number of
instructions decoded by the instruction decoders in the
AMD Athlon processor. Guidelines are listed in order of
importance.
Overview
The AMD Athlon processor instruction fetcher reads 16-byte
aligned code windows from the instruction cache. The
instruction bytes are then merged into a 24-byte instruction
queue. On each cycle, the in-order front-end engine selects for
decode up to three x86 instructions from the instruction-byte
queue.
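
The effect of the 16-byte aligned fetch window can be illustrated with a little address arithmetic. The following is a minimal sketch, not a model of the actual fetch hardware: it only counts how many aligned 16-byte windows a block of code bytes touches. The names WINDOW_BYTES and fetch_windows are hypothetical and are introduced here for illustration only.

```python
WINDOW_BYTES = 16  # size of one aligned code window, per the text above


def fetch_windows(start_address, length):
    """Return how many 16-byte aligned windows cover `length` code bytes
    beginning at `start_address`.

    Illustrative assumption: this is pure address arithmetic. It says
    nothing about how many cycles the fetcher needs per window.
    """
    if length <= 0:
        return 0
    first_window = start_address // WINDOW_BYTES
    last_window = (start_address + length - 1) // WINDOW_BYTES
    return last_window - first_window + 1


if __name__ == "__main__":
    # The same 48 bytes of code at two different starting addresses.
    print(fetch_windows(0x1000, 48))  # starts on a 16-byte boundary
    print(fetch_windows(0x100C, 48))  # starts 12 bytes into a window
```

Under these assumptions, the same 48 bytes of code span three windows when they start on a 16-byte boundary and four windows when they start 12 bytes into a window.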
All instructions (x86, x87, 3DNow!, and MMX) are
classified into two types of decodes: DirectPath and
VectorPath (see "DirectPath Decoder" and "VectorPath
Decoder" on page 133 for more information). DirectPath
instructions are common instructions that are decoded directly
in hardware. VectorPath instructions are more complex
instructions that require the use of a sequence of multiple
operations issued from an on-chip ROM.
Up to three DirectPath instructions can be selected for decode
per cycle. Only one VectorPath instruction can be selected for
decode per cycle. DirectPath instructions and VectorPath
instructions cannot be simultaneously decoded.
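
The three selection rules above can be sketched as a small counting model. This is an idealized illustration, not a description of the actual decoder: it assumes the instruction queue always holds enough bytes, that instructions are taken strictly in program order, and that no other limit (instruction length, alignment, branches) applies. The function name decode_cycles and the "D"/"V" encoding are hypothetical.

```python
def decode_cycles(kinds):
    """Count decode cycles for an in-order instruction stream.

    `kinds` is a sequence of "D" (DirectPath) and "V" (VectorPath).
    Rules taken from the text: up to three DirectPath instructions per
    cycle, only one VectorPath instruction per cycle, and the two types
    are never decoded in the same cycle.
    Illustrative assumption: nothing else ever limits decode.
    """
    for kind in kinds:
        if kind not in ("D", "V"):
            raise ValueError("unknown decode type: %r" % (kind,))

    cycles = 0
    i = 0
    n = len(kinds)
    while i < n:
        if kinds[i] == "V":
            # A VectorPath instruction is decoded alone.
            i += 1
        else:
            # Take up to three consecutive DirectPath instructions.
            taken = 0
            while i < n and kinds[i] == "D" and taken < 3:
                i += 1
                taken += 1
        cycles += 1
    return cycles


if __name__ == "__main__":
    print(decode_cycles("DDDDDD"))  # six DirectPath instructions
    print(decode_cycles("DDVDDD"))  # one of them replaced by VectorPath
```

In this model, six DirectPath instructions decode in two cycles, while the stream "DDVDDD" takes three cycles (DD, then V, then DDD), because the VectorPath instruction both occupies a cycle on its own and cuts short the DirectPath group before it.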