User's Manual

AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
One Supported Store-to-Load Forwarding Case
There is one case of mismatched store-to-load forwarding that
is supported by the AMD Athlon processor: the lower 32 bits of
an aligned QWORD write may feed a DWORD read.
Example 8 (Allowed):
MOVQ [AlignedQword], mm0
...
MOV EAX, [AlignedQword]
Summary of Store-to-Load Forwarding Pitfalls to Avoid
To avoid store-to-load forwarding pitfalls, code should conform
to the following guidelines:
- Maintain consistent use of operand size across all loads and
  stores. Preferably, use doubleword or quadword operand
  sizes.
- Avoid misaligned data references.
- Avoid narrow-to-wide and wide-to-narrow forwarding cases.
- When using word or byte stores, avoid loading data from
  anywhere in the same doubleword of memory other than the
  identical start addresses of the stores.
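The guidelines above can be illustrated with a short sketch. It
assumes a doubleword-aligned variable named Buf, which is a
hypothetical name used only for illustration, and it restates
the guidelines rather than describing measured behavior.

```asm
; Assumption: Buf is a hypothetical doubleword-aligned DWORD variable.

; Avoid: a word store followed by a byte load that starts
; inside the stored word (different size, different start address).
MOV WORD PTR [Buf], AX
MOV BL, BYTE PTR [Buf+1]

; Prefer: a doubleword store followed by a doubleword load
; of the same address (same size, same start address).
MOV DWORD PTR [Buf], EAX
MOV EBX, DWORD PTR [Buf]
```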
Stack Alignment Considerations
Make sure the stack is suitably aligned for the local variable
with the largest base type. Then, using the technique described
in "C Language Structure Component Considerations" on page
55, all variables can be properly aligned with no padding.
Extend to 32 Bits Before Pushing onto Stack
Function arguments smaller than 32 bits should be extended to
32 bits before being pushed onto the stack, which ensures that
the stack is always doubleword aligned on entry to a function.
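As a minimal sketch of this rule, assuming 32-bit code and the
hypothetical names ByteArg, WordArg, and Func (none of which
come from this guide), narrow values can be widened in a
register with MOVZX or MOVSX and then pushed as doublewords:

```asm
; Assumptions: ByteArg is an unsigned 8-bit variable, WordArg is a
; signed 16-bit variable, and Func is a callee taking two arguments.
; The push order shown is illustrative; it depends on the calling
; convention in use.
MOVZX EAX, BYTE PTR [ByteArg]   ; zero-extend the unsigned byte to 32 bits
PUSH  EAX                       ; push a full doubleword
MOVSX EAX, WORD PTR [WordArg]   ; sign-extend the signed word to 32 bits
PUSH  EAX                       ; push a full doubleword
CALL  Func
```

Pushing a 16-bit register directly would move ESP by only two
bytes, leaving the stack misaligned for the callee.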
If a function has no local variables with a base type larger than
doubleword, no further work is necessary. If the function does
have local variables whose base type is larger than a
doubleword, additional code should be inserted to ensure
proper alignment of the stack. For example, the following code
achieves quadword alignment: