
AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
Replace Certain SHLD Instructions with Alternative Code
Certain instances of the SHLD instruction can be replaced by
alternative code using SHR and LEA. The alternative code has
lower latency and requires fewer execution resources. SHR and
LEA (32-bit version) are DirectPath instructions, while SHLD is
a VectorPath instruction. Using SHR and LEA preserves decode
bandwidth because it potentially enables the decoding of a
third DirectPath instruction. The replacement applies only when
the value in REG2 is not needed afterward, because SHR
overwrites it, and when the code does not depend on the flags
set by SHLD.
Example 1 (Avoid):
SHLD REG1, REG2, 1
(Preferred):
SHR REG2, 31
LEA REG1, [REG1*2 + REG2]
Example 2 (Avoid):
SHLD REG1, REG2, 2
(Preferred):
SHR REG2, 30
LEA REG1, [REG1*4 + REG2]
Example 3 (Avoid):
SHLD REG1, REG2, 3
(Preferred):
SHR REG2, 29
LEA REG1, [REG1*8 + REG2]
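The equivalence behind the three examples above can be checked with a small model. The following Python sketch is an illustration only, not part of the original guide: the function names shld32 and shr_lea32 are hypothetical, registers are modeled as unsigned 32-bit integers, and flags are ignored. It assumes shift counts of 1, 2, or 3, the only counts that map onto the LEA scale factors 2, 4, and 8.

```python
MASK32 = 0xFFFFFFFF  # registers are modeled as unsigned 32-bit values


def shld32(dst, src, count):
    """Model SHLD dst, src, count: shift dst left, fill from the top bits of src."""
    return ((dst << count) | (src >> (32 - count))) & MASK32


def shr_lea32(dst, src, count):
    """Model SHR src, 32-count followed by LEA dst, [dst*2^count + src].

    Returns (new_dst, new_src) because SHR overwrites the source register.
    """
    src = src >> (32 - count)                    # SHR REG2, 32-count
    dst = (dst * (1 << count) + src) & MASK32    # LEA wraps modulo 2^32
    return dst, src


# The low `count` bits of dst*2^count are zero and the shifted src fits in
# those bits, so the addition performed by LEA equals the OR performed by SHLD.
for count in (1, 2, 3):
    for dst, src in [(0, 0), (MASK32, MASK32), (0x12345678, 0x9ABCDEF0)]:
        assert shr_lea32(dst, src, count)[0] == shld32(dst, src, count)
```

The second value returned by shr_lea32 shows the side effect that SHLD does not have: after the replacement sequence, REG2 holds only its former top bits.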
Use 8-Bit Sign-Extended Immediates
Using 8-bit sign-extended immediates improves code density
with no negative effects on the AMD Athlon processor. For
example, ADD BX, -5 should be encoded 83 C3 FB and not
81 C3 FB FF (the 16-bit immediate FFFBh, stored low byte first).
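The choice between the two encodings can be sketched as follows. This Python fragment is an illustration under stated assumptions, not an assembler: the function name encode_add_bx_imm is hypothetical, the code assumes a 16-bit default operand size (so no operand-size prefix is emitted), and it handles only the register BX. It uses opcode 83 /0 (ADD r/m16, sign-extended imm8) when the immediate fits in a signed byte and opcode 81 /0 (ADD r/m16, imm16) otherwise.

```python
def encode_add_bx_imm(value):
    """Return the bytes for ADD BX, value, preferring the short imm8 form."""
    if not -0x8000 <= value <= 0xFFFF:
        raise ValueError("immediate does not fit in 16 bits")
    value16 = value & 0xFFFF
    # Interpret the 16-bit pattern as a signed number to test the imm8 range.
    signed = value16 - 0x10000 if value16 & 0x8000 else value16
    modrm = 0xC3  # mod=11 (register), reg=000 (/0 selects ADD), rm=011 (BX)
    if -128 <= signed <= 127:
        # Short form: the processor sign-extends the byte to 16 bits.
        return bytes([0x83, modrm, signed & 0xFF])
    # Long form: full 16-bit immediate, stored low byte first.
    return bytes([0x81, modrm, value16 & 0xFF, value16 >> 8])


print(encode_add_bx_imm(-5).hex(" "))      # 83 c3 fb
print(encode_add_bx_imm(0x1234).hex(" "))  # 81 c3 34 12
```

The short form saves one byte per instruction for 16-bit operands; for 32-bit operands the same rule saves three bytes, since the long form then carries a 4-byte immediate.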