Parallel Programming Guide for HP-UX Systems

Troubleshooting

Floating-point imprecision

The compiler applies normal arithmetic rules to real numbers. It
assumes that two arithmetically equivalent expressions produce the
same numerical result.

Most real numbers cannot be represented exactly in digital computers.
Instead, each number is rounded to a floating-point value that can be
represented. When optimization changes the evaluation order of a
floating-point expression, the results can change. Possible consequences
of floating-point roundoff include program aborts, division by zero,
address errors, and incorrect results.
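
The effect of evaluation order can be reproduced in a few lines. The following is a minimal sketch in standard fixed-form Fortran, not taken from this guide: the program name ASSOC and the variables X, Y, Z, R1, and R2 are hypothetical, and the values are chosen only to make the roundoff visible. It assumes IEEE 754 double-precision arithmetic with round-to-nearest.

```fortran
      PROGRAM ASSOC
C     Illustrative names only; none of these come from the guide.
      DOUBLE PRECISION X, Y, Z, R1, R2
      X = 1.0D16
      Y = -1.0D16
      Z = 1.0D0
C     Algebraically, both expressions below equal Z.
C     Parentheses fix the evaluation order of each one.
      R1 = (X + Y) + Z
      R2 = X + (Y + Z)
      PRINT *, '(X + Y) + Z = ', R1
      PRINT *, 'X + (Y + Z) = ', R2
C     A value that rounds to zero turns a later divide into a
C     division by zero, so test before dividing.
      IF (R2 .EQ. 0.0D0) THEN
         PRINT *, 'Z / R2 would divide by zero'
      ELSE
         PRINT *, 'Z / R2 = ', Z / R2
      END IF
      END
```

Under the stated assumption, the first expression typically evaluates to 1 and the second to 0, because Z is lost when it is added to the much larger Y. Results may differ on hardware that keeps intermediate values in extended precision.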

In any parallel program, the execution order of the instructions differs
from that of the serial version of the same program. This can cause
noticeable roundoff differences between the two versions. Running a
parallel code under different machine configurations or conditions can
also yield roundoff differences: the execution order can vary from run
to run, so roundoff errors propagate in a different order in each
execution. Accumulator variables (reductions) are especially susceptible
to these problems.
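
The sensitivity of an accumulator to summation order can be seen even without parallel directives. The following is a serial sketch in standard fixed-form Fortran, offered as an illustration rather than as code from this guide: the program name SUMORD, the variables FWD, BWD, P1, and P2, and the terms 1/I are all hypothetical. It adds the same 10000 terms in three different orders, the last of which mimics two threads forming partial sums that are combined at the end.

```fortran
      PROGRAM SUMORD
C     Illustrative names only; none of these come from the guide.
      INTEGER N, I
      PARAMETER (N = 10000)
      REAL FWD, BWD, P1, P2
C     Serial order: add the terms from I = 1 up to N.
      FWD = 0.0
      DO I = 1, N
         FWD = FWD + 1.0 / REAL(I)
      END DO
C     Reverse order: the same terms, added from I = N down to 1.
      BWD = 0.0
      DO I = N, 1, -1
         BWD = BWD + 1.0 / REAL(I)
      END DO
C     Two partial sums, as two threads might form them, then combined.
      P1 = 0.0
      P2 = 0.0
      DO I = 1, N / 2
         P1 = P1 + 1.0 / REAL(I)
      END DO
      DO I = N / 2 + 1, N
         P2 = P2 + 1.0 / REAL(I)
      END DO
      PRINT *, 'FORWARD  = ', FWD
      PRINT *, 'BACKWARD = ', BWD
      PRINT *, 'PARTIAL  = ', P1 + P2
      END
```

In single precision the three printed sums typically disagree in their last few digits, although all three are arithmetically equivalent. The exact values depend on the compiler, the optimization level, and the hardware.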

Consider the following Fortran example:

C$DIR GATE(ACCUM_LOCK)
      LK = ALLOC_GATE(ACCUM_LOCK)
      LK = UNLOCK_GATE(ACCUM_LOCK)
C$DIR BEGIN_TASKS, TASK_PRIVATE(I)
      CALL COMPUTE(A)
C$DIR CRITICAL_SECTION(ACCUM_LOCK)
      ACCUM = ACCUM + A
C$DIR END_CRITICAL_SECTION
C$DIR NEXT_TASK
      DO I = 1, 10000
        B(I) = FUNC(I)
C$DIR CRITICAL_SECTION(ACCUM_LOCK)
        ACCUM = ACCUM + B(I)
C$DIR END_CRITICAL_SECTION