Appendix A
Algorithm Optimization Examples
This appendix illustrates how the characteristics of the VAX 6000
series vector processor can be used to build optimized routines for this
system and how the algorithm and its implementation can change the
performance of an application on the VAX 6000 processor.
The VAX 6000 series vector processor delivers high performance for
computationally intensive applications. Based on CMOS technology, the
VAX 6000 Model 400 vector processor is capable of operating at peak
speeds of 90 MFLOPs single precision and 45 MFLOPs double precision.
Linear algebra and signal processing applications that use the
various hardware features have demonstrated vector speedups of
3 to 35 times over scalar VAX 6000 CPU times. With the integrated
vector processing available on the VAX 6000 series, the performance
of computationally intensive applications may now approach that of
supercomputers.
Algorithm changes can alter the data access patterns to more efficiently
use the memory subsystem, can increase the average vector length,
and can minimize the number of vector operations required. As
Amdahl's Law shows when applied to vectorization, overall performance
improves as the percentage of code that is vectorized increases.
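
The effect of the vectorized percentage can be sketched numerically.
The C function below is a minimal illustration of the usual form of
Amdahl's Law, not a routine from this guide. The name amdahl_speedup
and its parameters are assumptions made for the example: f is the
fraction of scalar run time that is vectorized, and s is the speedup
of that vectorized portion.

```c
/* Overall speedup predicted by Amdahl's Law.
 *   f : fraction of scalar run time that is vectorized (0.0 to 1.0)
 *   s : speedup of the vectorized portion (hypothetical input, s > 0)
 * The unvectorized fraction (1 - f) still runs at scalar speed. */
double amdahl_speedup(double f, double s)
{
    return 1.0 / ((1.0 - f) + f / s);
}
```

With an illustrative s of 10, vectorizing half of the run time
(f = 0.5) gives an overall speedup of about 1.8, while vectorizing
90 percent (f = 0.9) gives about 5.3. These figures are arithmetic
examples only, not VAX 6000 measurements.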
Four basic optimization methods take advantage of the processing
power of the VAX 6000 series system:
• Rearrange code for maximum vectorization of the inner loop and
  remove data dependencies within the loop
• Vectorize across contiguous memory locations to produce unity stride
  vectors for increased cache hit rates and optimized cache miss
  handling
• Reuse the data already loaded into the vector registers as frequently
  as possible to reduce the number of vector load and store operations
• Maximize instruction execution overlap by pairing arithmetic
  instructions between load and store instructions wherever possible
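
The first two methods can be illustrated with a matrix-vector product.
The sketch below is written in C rather than FORTRAN and is not taken
from this guide. The function names and the A_AT macro are
hypothetical, and the matrix is assumed to be stored column by column,
as FORTRAN stores arrays. Both routines compute the same result; only
the loop order, and therefore the stride of the inner loop, differs.

```c
#include <stddef.h>

/* Element (i, j) of a matrix with n rows stored column by column
 * (FORTRAN order). Hypothetical helper for this example. */
#define A_AT(a, n, i, j) ((a)[(size_t)(j) * (n) + (i)])

/* Inner loop runs along a row: consecutive accesses to a are
 * n elements apart, so the inner loop has stride n. */
void matvec_row_inner(size_t n, size_t m, const double *a,
                      const double *x, double *y)
{
    for (size_t i = 0; i < n; i++) {
        double sum = 0.0;
        for (size_t j = 0; j < m; j++)
            sum += A_AT(a, n, i, j) * x[j];
        y[i] = sum;
    }
}

/* Loops interchanged: the inner loop runs down a column, so
 * accesses to a and y are contiguous (unity stride), and each
 * iteration of the inner loop is independent of the others. */
void matvec_col_inner(size_t n, size_t m, const double *a,
                      const double *x, double *y)
{
    for (size_t i = 0; i < n; i++)
        y[i] = 0.0;
    for (size_t j = 0; j < m; j++) {
        double xj = x[j];               /* scalar reused for the whole column */
        for (size_t i = 0; i < n; i++)
            y[i] += A_AT(a, n, i, j) * xj;
    }
}
```

In the second form the same elements of y are updated on every pass
of the outer loop. A vectorized version could therefore hold a strip
of y in a vector register across several columns instead of storing
and reloading it each time, which is the kind of reuse the third
method describes. Whether a given compiler does this automatically is
not something this sketch assumes.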
Further information on optimization techniques in FORTRAN can be
found in the VAX FORTRAN Performance Guide available with the
FORTRAN-High Performance Option.