
AMD Athlon Processor x86 Code Optimization
22007E/0 November 1999
Align Data Where Possible
In general, avoid misaligned data references. All data whose
size is a power of 2 is considered aligned if it is naturally
aligned. For example:
QWORD accesses are aligned if they access an address divisible by 8.
DWORD accesses are aligned if they access an address divisible by 4.
WORD accesses are aligned if they access an address divisible by 2.
TBYTE accesses are aligned if they access an address divisible by 8.
A misaligned store or load operation suffers a minimum
one-cycle penalty in the AMD Athlon processor load/store
pipeline. In addition, using misaligned loads and stores
increases the likelihood of encountering a store-to-load
forwarding pitfall. For a more detailed discussion of
store-to-load forwarding issues, see "Store-to-Load Forwarding
Restrictions" on page 51.
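
The alignment rules listed above can be checked at run time with a small helper. The following C code is a minimal sketch, not part of the original guide. The function names access_alignment and is_aligned_access are hypothetical, and the code assumes a flat address space in which converting a pointer to uintptr_t yields the numeric address.

```c
#include <stddef.h>
#include <stdint.h>

/* Alignment boundary, in bytes, for an access of the given size,
 * following the rules listed above: WORD (2), DWORD (4), and QWORD (8)
 * align to their own size; a 10-byte TBYTE access aligns to 8.
 * Any other size returns 1, meaning "no requirement" in this sketch. */
static size_t access_alignment(size_t access_size)
{
    switch (access_size) {
    case 2:  return 2;   /* WORD  */
    case 4:  return 4;   /* DWORD */
    case 8:  return 8;   /* QWORD */
    case 10: return 8;   /* TBYTE */
    default: return 1;
    }
}

/* Returns nonzero if an access of access_size bytes at addr is aligned.
 * Assumes the pointer-to-integer conversion yields the numeric address. */
static int is_aligned_access(const void *addr, size_t access_size)
{
    return ((uintptr_t)addr % access_alignment(access_size)) == 0;
}
```

A check like this is probably most useful in debug builds or assertions, for example to flag a DWORD load from an address that is only 2-byte aligned.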
Use the 3DNow! PREFETCH and PREFETCHW Instructions
For code that can take advantage of prefetching, use the
3DNow! PREFETCH and PREFETCHW instructions to
increase the effective bandwidth to the AMD Athlon processor.
The PREFETCH and PREFETCHW instructions take
advantage of the AMD Athlon processor's high bus bandwidth
to hide long latencies when fetching data from system memory.
The prefetch instructions are essentially integer instructions
and can be used anywhere, in any type of code (integer, x87,
3DNow!, MMX, etc.).
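
As an illustration of prefetching in ordinary integer or floating-point code, the following C code is a minimal sketch that uses the GCC/Clang builtin __builtin_prefetch as a stand-in for the PREFETCH and PREFETCHW instructions. Which instruction the compiler actually emits depends on the compiler and target options, so treat the mapping as an assumption. The cache-line size, the prefetch distance, and the function names are also assumptions of this sketch and should be tuned by measurement.

```c
#include <stddef.h>

/* Assumptions of this sketch, not values taken from the text above. */
#define CACHE_LINE_BYTES     64  /* assumed cache-line size            */
#define PREFETCH_LINES_AHEAD 4   /* assumed prefetch distance in lines */

/* Sum n floats, requesting data a few cache lines ahead of the
 * current read position (read intent, analogous to PREFETCH). */
static float sum_with_prefetch(const float *data, size_t n)
{
    const size_t per_line = CACHE_LINE_BYTES / sizeof(float);
    const size_t ahead = PREFETCH_LINES_AHEAD * per_line;
    float total = 0.0f;

    for (size_t i = 0; i < n; i++) {
        /* Issue one prefetch per cache line, and stay inside the array. */
        if (i % per_line == 0 && i + ahead < n)
            __builtin_prefetch(&data[i + ahead], 0);  /* 0 = read */
        total += data[i];
    }
    return total;
}

/* Scale n floats in place. Because the data will be modified, the
 * prefetch is issued with write intent (analogous to PREFETCHW). */
static void scale_in_place(float *data, size_t n, float factor)
{
    const size_t per_line = CACHE_LINE_BYTES / sizeof(float);
    const size_t ahead = PREFETCH_LINES_AHEAD * per_line;

    for (size_t i = 0; i < n; i++) {
        if (i % per_line == 0 && i + ahead < n)
            __builtin_prefetch(&data[i + ahead], 1);  /* 1 = write */
        data[i] *= factor;
    }
}
```

The prefetch is only a hint, so both functions return the same results with or without it. Only the memory access timing is expected to change.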
Large data sets typically require unit-stride access to ensure
that all data pulled in by PREFETCH or PREFETCHW is
actually used. If necessary, algorithms or data structures should
be reorganized to allow unit-stride access.
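
One common way to reorganize a data structure for unit-stride access is to replace an array of records with one array per field. The following C code is a minimal sketch of that idea. The particle types and function names are hypothetical and chosen only for illustration.

```c
#include <stddef.h>

/* Hypothetical record layout. Reading only x from an array of these
 * advances sizeof(struct particle_aos) bytes per element, so most of
 * each fetched cache line holds fields the loop never uses. */
struct particle_aos {
    float x, y, z, mass;
};

/* Reorganized layout: each field is stored contiguously, so a loop
 * over one field walks memory with unit stride. */
struct particles_soa {
    float *x, *y, *z, *mass;
    size_t count;
};

/* Strided access: only one float in each record is used. */
static float sum_x_aos(const struct particle_aos *p, size_t n)
{
    float total = 0.0f;
    for (size_t i = 0; i < n; i++)
        total += p[i].x;
    return total;
}

/* Unit-stride access: every float brought in from p->x is used. */
static float sum_x_soa(const struct particles_soa *p)
{
    float total = 0.0f;
    for (size_t i = 0; i < p->count; i++)
        total += p->x[i];
    return total;
}
```

Both functions compute the same sum. The second touches only the bytes it needs, which is the property the text above asks for when prefetched lines are to be fully used.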