Algorithm Optimization Examples
Figure A–3 Optimized Cooley-Tukey Butterfly Graph, One-Dimensional Fast Fourier Transform for N = 16
Refer to the printed version of this book, EK–60VAA–PG.
Reusing data in the vector registers also saves vector processing time.
The VAX vector architecture provides 16 vector registers. With careful
allocation of all 16 registers, the results of one butterfly stage can stay
in registers and feed the next stage directly, with no store and reload
between the two stages. Because this halves the number of loads and
stores, vector performance almost doubles.
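The effect of fusing two butterfly stages can be sketched with a small scalar model. The Python code below is an illustration only, not the guide's vector code: a list stands in for memory, local variables stand in for vector registers, and the names fft_staged, fft_fused, and the traffic counter are hypothetical. It assumes a radix-2 decimation-in-time FFT on bit-reversed input with N a power of two, and it counts only data loads and stores (twiddle factors are computed, not loaded).

```python
import cmath

def twiddle(j, span):
    """Twiddle factor exp(-2*pi*i*j/span)."""
    return cmath.exp(-2j * cmath.pi * j / span)

def bit_reversed(x):
    """Return x permuted into bit-reversed order (len(x) a power of two)."""
    bits = len(x).bit_length() - 1
    return [x[int(format(i, "0%db" % bits)[::-1], 2)] for i in range(len(x))]

def single_stage(x, h, traffic):
    """One butterfly stage with half-span h: 2 loads and 2 stores per butterfly."""
    for k in range(0, len(x), 2 * h):
        for j in range(h):
            a, b = x[k + j], x[k + j + h]            # load 2 values
            t = twiddle(j, 2 * h) * b
            x[k + j], x[k + j + h] = a + t, a - t    # store 2 values
            traffic["loads"] += 2
            traffic["stores"] += 2

def fused_two_stages(x, h, traffic):
    """Stages with half-spans h and 2h, done on 4 values held in 'registers'."""
    for k in range(0, len(x), 4 * h):
        for j in range(h):
            i0, i1, i2, i3 = k + j, k + j + h, k + j + 2 * h, k + j + 3 * h
            a, b, c, d = x[i0], x[i1], x[i2], x[i3]  # load 4 values once
            traffic["loads"] += 4
            # First stage: pairs (a, b) and (c, d) share one twiddle factor.
            w = twiddle(j, 2 * h)
            tb, td = w * b, w * d
            a, b = a + tb, a - tb
            c, d = c + td, c - td
            # Second stage: pairs (a, c) and (b, d), no memory access between.
            tc = twiddle(j, 4 * h) * c
            td = twiddle(j + h, 4 * h) * d
            x[i0], x[i2] = a + tc, a - tc            # store 4 values once
            x[i1], x[i3] = b + td, b - td
            traffic["stores"] += 4

def fft_staged(x):
    """Reference version: every stage loads and stores all N values."""
    y, traffic, h = bit_reversed(x), {"loads": 0, "stores": 0}, 1
    while h < len(y):
        single_stage(y, h, traffic)
        h *= 2
    return y, traffic

def fft_fused(x):
    """Same transform, with successive stages fused in pairs."""
    y, traffic, h = bit_reversed(x), {"loads": 0, "stores": 0}, 1
    if (len(y).bit_length() - 1) % 2 == 1:           # odd stage count:
        single_stage(y, h, traffic)                  # run one stage alone
        h *= 2
    while h < len(y):
        fused_two_stages(y, h, traffic)
        h *= 4
    return y, traffic

if __name__ == "__main__":
    data = [complex(i, 0) for i in range(16)]
    print(fft_staged(data)[1])
    print(fft_fused(data)[1])
```

For N = 16 (four stages), the staged version in this model performs 64 loads and 64 stores, and the fused version performs 32 of each while producing the same result. The model says nothing about register pressure; on real hardware the fused loop also needs registers for twiddle factors and temporaries, which is why all 16 registers must be allocated carefully.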
A.2.2 Optimized Two-Dimensional Fast Fourier Transforms
The optimized one-dimensional FFT can be used to compute
multidimensional FFTs. Figure A–5 shows how an N by N two-
dimensional FFT can be computed by performing N one-dimensional
column FFTs and then N one-dimensional row FFTs. The same routine
can perform either the column or the row FFTs; only the stride parameter
passed to the routine changes. (Note: In FORTRAN, the column