Parallel Programming Guide for HP-UX Systems

Troubleshooting
Misused directives and pragmas
Chapter 9
The apparent dependence is removed, and both loops are optimized.
Nondeterminism of parallel execution
In a parallel program, threads do not execute in a predictable or
determined order. If you force the compiler to parallelize a loop when a
dependence exists, the results are unpredictable and can vary from one
execution to the next.
Consider the following Fortran code:
      DO I = 1, N-1
        A(I) = A(I+1) * B(I)
        .
        .
        .
      ENDDO
The compiler does not parallelize this code as written because of the
loop-carried dependence on A: iteration I reads A(I+1), the element that
iteration I+1 overwrites. The original value of A(I+1) must therefore
still be available when A(I) is computed.
If this code were parallelized, some processors could assign new values
to elements of A before other processors had read the original values,
resulting in incorrect assignments.
Because the results depend on the order in which statements execute,
the errors are nondeterministic. The loop must therefore execute in
iteration order to ensure that all values of A are computed correctly.
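The effect of execution order can be seen without running anything in
parallel. The following sketch runs the loop once in iteration order and
once in reverse order, which stands in for one of the many orders a
parallel run might produce. The array size, the data values, and the
name A_REV are assumptions made for illustration; they do not come from
the original example.

```fortran
      PROGRAM ORDER_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 5
      REAL :: A(N), A_REV(N), B(N)
      INTEGER :: I

      ! Hypothetical test data, chosen only for illustration.
      A = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
      B = 2.0
      A_REV = A

      ! Iteration order (what the serial loop does):
      ! each A(I+1) is read before it is overwritten.
      DO I = 1, N-1
        A(I) = A(I+1) * B(I)
      ENDDO

      ! Reverse order, standing in for one possible parallel
      ! schedule: each A_REV(I+1) is overwritten before it is read.
      DO I = N-1, 1, -1
        A_REV(I) = A_REV(I+1) * B(I)
      ENDDO

      PRINT *, 'iteration order:', A
      PRINT *, 'reverse order:  ', A_REV
      END PROGRAM ORDER_DEMO
```

With these values, the loop in iteration order leaves A as 4, 6, 8, 10,
5, while the reverse-order loop leaves A_REV as 80, 40, 20, 10, 5. A
real parallel run could produce either result or a mixture of the two.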
Loops containing dependences can sometimes be manually parallelized
using the LOOP_PARALLEL(ORDERED) directive. Unless you are sure that
no loop-carried dependence exists, it is safest to let the compiler choose
which loops to parallelize.
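For comparison, the sketch below shows what a version of the example
loop with no loop-carried dependence might look like. It is not a
technique prescribed here, only an illustration: the loop reads from a
snapshot of the original values, so no iteration depends on another.
The name AOLD, the array size, and the data values are hypothetical.

```fortran
      PROGRAM COPY_DEMO
      IMPLICIT NONE
      INTEGER, PARAMETER :: N = 5
      REAL :: A(N), AOLD(N), B(N)
      INTEGER :: I

      ! Hypothetical test data, chosen only for illustration.
      A = (/ 1.0, 2.0, 3.0, 4.0, 5.0 /)
      B = 2.0

      ! Snapshot the original values before any element is changed.
      AOLD = A

      ! Each iteration reads only AOLD and B and writes only A(I),
      ! so the result is the same in any iteration order. The loop
      ! is run in reverse here to make that point.
      DO I = N-1, 1, -1
        A(I) = AOLD(I+1) * B(I)
      ENDDO

      PRINT *, A
      END PROGRAM COPY_DEMO
```

Even in reverse order, this loop leaves A as 4, 6, 8, 10, 5, the same
values the original loop produces when it executes in iteration order.
The cost is the extra array and the copy.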