HP C Programmer's Guide (92434-90009)

Optimizing HP C Programs
Parallel Execution
Calling Routines with Side Effects
The compiler will not parallelize any loop containing a call to a routine that has side
effects. A routine has side effects if it does any of the following:
Modifies its arguments
Modifies an extern or static variable
Redefines variables that are local to the calling routine
Performs I/O
Calls another subroutine or function that does any of the above
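The distinction can be sketched in a few lines of C. The routine names below (scale, next_id, fill_scaled, fill_ids) are hypothetical and do not come from this guide. The sketch only illustrates which calls fall under the restriction above. It does not claim that the first loop will be parallelized, since other conditions may still apply.

```c
#include <assert.h>

/* Hypothetical routine with no side effects: it reads its arguments,
   returns a value, and touches nothing else. */
static int scale(int x, int factor)
{
    return x * factor;
}

/* Hypothetical routine with a side effect: it modifies a static
   variable, so each result depends on how many calls came before. */
static int next_id(void)
{
    static int counter = 0;
    return ++counter;
}

/* The only call in this loop is to scale(), which has no side effects. */
void fill_scaled(int *out, const int *in, int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        out[i] = scale(in[i], 2);
}

/* This loop calls next_id(), which modifies a static variable, so the
   loop falls under the side-effect restriction described above. */
void fill_ids(int *out, int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        out[i] = next_id();
}
```

In fill_ids, the value stored in out[i] depends on the order in which the calls happen, which is why a call with side effects blocks parallelization.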
Indeterminate Iteration Counts
If the compiler determines that a loop's iteration count cannot be computed at run time
before the loop starts to execute, it will not parallelize the loop. The runtime code must
know the iteration count in order to decide how many iterations to distribute to each
processor.
The following conditions can prevent a runtime count:
The loop is an infinite loop.
A conditional break statement or goto out of the loop appears in the loop.
The loop modifies either the loop-control or loop-limit variable.
The loop is a while construct and the condition being tested is defined within the loop.
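The following sketch contrasts a loop whose count is known on entry with three loops that match conditions in the list above. All function names are hypothetical and chosen only for illustration. The comments describe how each loop relates to the listed conditions, not the behavior of any particular compiler release.

```c
#include <assert.h>

/* Count known on entry: i runs from 0 to n-1, and neither i nor n is
   modified in the loop body. */
void double_all(int *a, int n)
{
    int i;
    assert(n >= 0);
    for (i = 0; i < n; i++)
        a[i] *= 2;
}

/* Conditional break: the number of iterations depends on the data.
   Returns the index at which the loop stopped. */
int double_until_zero(int *a, int n)
{
    int i;
    for (i = 0; i < n; i++) {
        if (a[i] == 0)
            break;
        a[i] *= 2;
    }
    return i;
}

/* The body modifies the loop-limit variable n.
   Returns the number of iterations actually executed. */
int shrink_limit(const int *a, int n)
{
    int i, count = 0;
    for (i = 0; i < n; i++) {
        if (a[i] < 0)
            n--;
        count++;
    }
    return count;
}

/* while construct whose tested value, x, is defined within the loop.
   Returns the number of halvings needed to bring x down to 1 or less. */
int count_halvings(int x)
{
    int steps = 0;
    while (x > 1) {
        x /= 2;
        steps++;
    }
    return steps;
}
```

In the last three functions, the number of iterations can only be discovered by running the loop, so there is nothing for the runtime code to divide among processors ahead of time.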
Data Dependence
When a loop is parallelized, the iterations are executed independently on different
processors, and the order of execution will differ from the serial order that occurs on a
single processor. This reordering is not a problem as long as the iterations could be
executed in any order with no effect on the results. Consider the following loop:
for (i=0; i<5; i++)
    a[i] = a[i] * b[i];
In this example, the array a would always end up with the same data regardless of
whether the order of execution were 0-1-2-3-4, 4-3-2-1-0, 3-1-4-0-2, or any other order. The
independence of each iteration from the others makes the loop an eligible candidate for
parallelization.
Such is not the case in the following:
for (i=1; i<5; i++)
    a[i] = a[i-1] * b[i];
In this loop, the order of execution does matter. The data used in iteration i depends
on the data produced in the previous iteration, i-1. The array a would end up with very
different data if the order of execution were anything other than 1-2-3-4. The data
dependence in this loop thus makes it ineligible for parallelization.
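The effect of execution order on the two loops can be checked with a small serial experiment. The sketch below does not run anything in parallel. It simply executes the same loop bodies in a caller-supplied order, which stands in for the reordering that parallel execution may cause. The function names and the order array are assumptions made for this illustration.

```c
#include <assert.h>

/* Independent iterations: a[i] = a[i] * b[i], visited in the order
   given by order[0..count-1]. Each iteration touches only element i. */
void multiply_in_order(int *a, const int *b, const int *order, int count)
{
    int k;
    for (k = 0; k < count; k++) {
        int i = order[k];
        a[i] = a[i] * b[i];
    }
}

/* Dependent iterations: a[i] = a[i-1] * b[i], visited in the order
   given by order[0..count-1]. Iteration i reads the element written
   by iteration i-1, so the visiting order changes the result. */
void chain_in_order(int *a, const int *b, const int *order, int count)
{
    int k;
    for (k = 0; k < count; k++) {
        int i = order[k];
        assert(i >= 1);          /* a[i-1] must exist */
        a[i] = a[i - 1] * b[i];
    }
}
```

For example, with a initialized to all ones and b = {2, 3, 4, 5, 6}, multiply_in_order leaves a = {2, 3, 4, 5, 6} for the orders 0-1-2-3-4 and 3-1-4-0-2 alike. With the same inputs, chain_in_order produces a[4] = 360 in the order 1-2-3-4 but a[4] = 6 in the order 4-3-2-1.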
Not all data dependences inhibit parallelization. The following paragraphs discuss
some of the exceptions.
Nested Loops and Matrices
Some nested loops that operate on matrices may have a