User's Manual

40 Code Padding Using Neutral Code Fillers
AMD Athlon Processor x86 Code Optimization
22007E/0November 1999
Recommendations for the AMD Athlon Processor
For code that is optimized specifically for the AMD Athlon
processor, the optimal code fillers are NOP instructions (opcode
0x90) with up to two REP prefixes (0xF3). In the AMD Athlon
processor, a NOP with up to two REP prefixes can be handled
by a single decoder with no overhead. As the REP prefixes are
redundant and meaningless, they get discarded, and NOPs are
handled without using any execution resources. The three
decoders of AMD Athlon processor can handle up to three
NOPs, each with up to two REP prefixes each, in a single cycle,
for a neutral code filler of up to nine bytes.
Note: When used as a filler instruction, REP/REPNE prefixes can
be used in conjunction only with NOPs. REP/REPNE has
undefined behavior when used with instructions other than
a NOP.
If a larger amount of code padding is required, it is
recommended to use a JMP instruction to jump across the
padding region. The following assembly language macros show
this:
NOP1_ATHLON TEXTEQU <DB 090h>
NOP2_ATHLON TEXTEQU <DB 0F3h, 090h>
NOP3_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h>
NOP4_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 090h>
NOP5_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 090h>
NOP6_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h>
NOP7_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h,
090h>
NOP8_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h,
0F3h, 090h>
NOP9_ATHLON TEXTEQU <DB 0F3h, 0F3h, 090h, 0F3h, 0F3h, 090h,
0F3h, 0F3h, 090h>
NOP10_ATHLONTEXTEQU <DB 0EBh, 008h, 90h, 90h, 90h, 90h,
90h, 90h, 90h, 90h>