AMD Athlon™ Processor x86 Code Optimization
22007E/0—November 1999
AMD Athlon™ Processor Microarchitecture Summary
The AMD Athlon processor brings superscalar performance
and high operating frequency to PC systems running
industry-standard x86 software. A brief summary of the
next-generation design features implemented in the
AMD Athlon processor is as follows:
■ High-speed double-rate local bus interface
■ Large, split 128-Kbyte level-one (L1) cache
■ Dedicated backside level-two (L2) cache
■ Instruction predecode and branch detection during cache
line fills
■ Decoupled decode/execution core
■ Three-way x86 instruction decoding
■ Dynamic scheduling and speculative execution
■ Three-way integer execution
■ Three-way address generation
■ Three-way floating-point execution
■ 3DNow!™ technology and MMX™ single-instruction
multiple-data (SIMD) instruction extensions
■ Super data forwarding
■ Deep out-of-order integer and floating-point execution
■ Register renaming
■ Dynamic branch prediction
The AMD Athlon processor communicates through a
next-generation high-speed local bus that goes beyond the current
Socket 7 or Super7™ bus standard. The local bus can transfer
data at twice the rate of the bus operating frequency by using
both the rising and falling edges of the clock (see
“AMD Athlon™ System Bus” on page 139 for more
information).
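The double-rate relationship described above can be illustrated with a small arithmetic sketch in C. It is not taken from this guide: the function names and the bus_width_bits parameter are hypothetical, and the bus width is an assumption, since the paragraph above states only that data moves on both clock edges.

```c
#include <assert.h>
#include <stdint.h>

/* Data transfers per second on a bus that moves data on both the rising
 * and falling edges of the clock: two transfers per clock cycle. */
uint64_t transfers_per_second(uint64_t bus_clock_hz)
{
    return 2u * bus_clock_hz;
}

/* Peak bandwidth in bytes per second.
 * bus_width_bits is an assumed parameter for illustration; the text
 * above does not state the width of the data path. */
uint64_t peak_bytes_per_second(uint64_t bus_clock_hz, unsigned bus_width_bits)
{
    assert(bus_width_bits % 8 == 0); /* require whole bytes per transfer */
    return transfers_per_second(bus_clock_hz) * (bus_width_bits / 8u);
}
```

As an example with assumed values, a 100-MHz bus clock gives transfers_per_second(100000000) = 200,000,000 transfers per second, and with an assumed 64-bit data path peak_bytes_per_second(100000000, 64) gives 1,600,000,000 bytes per second. Actual clock rates and widths should be taken from the system bus description referenced above.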
To reduce on-chip cache miss penalties and to avoid subsequent
data load or instruction fetch stalls, the AMD Athlon processor
has a dedicated high-speed backside L2 cache. The large
128-Kbyte L1 on-chip cache and the backside L2 cache allow the