user manual

22007E/0 November 1999 AMD Athlon Processor x86 Code Optimization
which might inhibit certain optimizations with some
compilers, for example, aggressive inlining.
Dynamic Memory Allocation Consideration
Dynamic memory allocation (malloc in the C language) should
always return a pointer that is suitably aligned for the largest
base type (quadword alignment). Where this alignment cannot
be guaranteed, use the technique shown in the following code
to make the pointer quadword aligned, if needed. This code
assumes the pointer can be cast to a long without loss of bits.
Example:
double* p;
double* np;
p = (double *)malloc(sizeof(double)*number_of_doubles+7L);
np = (double *)((((long)(p))+7L) & (-8L));
Then use np instead of p to access the data. p is still needed
in order to deallocate the storage.
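The same technique can be sketched more portably for compilers where a long is narrower than a pointer. This is a minimal sketch, not part of the original example: the helper name alloc_doubles_aligned8 and its raw output parameter are hypothetical, and it assumes the compiler provides uintptr_t in <stdint.h>.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper (not from the original manual): returns a
   quadword-aligned pointer with room for n doubles. *raw receives
   the unadjusted pointer, which is the one to pass to free(). */
static double *alloc_doubles_aligned8(size_t n, void **raw)
{
    /* Over-allocate by 7 bytes so an 8-byte boundary always fits. */
    *raw = malloc(n * sizeof(double) + 7);
    if (*raw == NULL)
        return NULL;

    /* Round the address up to the next multiple of 8.
       Assumption: uintptr_t is available and wide enough for a pointer. */
    uintptr_t addr = ((uintptr_t)*raw + 7u) & ~(uintptr_t)7;
    return (double *)addr;
}
```

As in the example above, access the data through the returned pointer and release the storage through the raw pointer, for instance free(raw).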
Introduce Explicit Parallelism into Code
Where possible, long dependency chains should be broken into
several independent dependency chains, which can then be
executed in parallel, exploiting the pipelined execution units.
This is especially important for floating-point code, whether it
is mapped to x87 or 3DNow! instructions, because of the longer
latency of floating-point operations. Since most languages,
including ANSI C, guarantee that floating-point expressions are
not reordered, compilers usually cannot perform such
optimizations unless they offer a switch that allows
non-ANSI-compliant reordering of floating-point expressions
according to algebraic rules.
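As an illustration of breaking one long chain into several, the following sketch sums an array two ways. The function names sum_serial and sum_four_chains and the choice of four accumulators are assumptions made for this illustration; the best number of chains depends on the operation latency and the number of execution units.

```c
#include <stddef.h>

/* One dependency chain: every add must wait for the previous add. */
static double sum_serial(const double *a, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}

/* Four independent chains: the adds into s0..s3 do not depend on
   each other, so they can overlap in the floating-point pipeline. */
static double sum_four_chains(const double *a, size_t n)
{
    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t i = 0;

    for (; i + 4 <= n; i += 4) {
        s0 += a[i];
        s1 += a[i + 1];
        s2 += a[i + 2];
        s3 += a[i + 3];
    }
    for (; i < n; i++)      /* leftover elements when n is not a multiple of 4 */
        s0 += a[i];

    /* Combine the partial sums once, after the loop. */
    return (s0 + s1) + (s2 + s3);
}
```

The second version adds the elements in a different order than the first, so the programmer, not the compiler, has taken responsibility for the reordering.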
Note that re-ordered code that is algebraically identical to the
original code does not necessarily deliver identical
computational results due to the lack of associativity of
floating-point operations. There are well-known numerical
considerations in applying these optimizations (consult a book
on numerical analysis). In some cases, these optimizations may