Parallel Programming Guide for HP-UX Systems

Parallel synchronization

Synchronizing code

Chapter 8, page 150

#pragma _CNX loop_parallel(ivar=i)
for (i = 0; i < n; i++) {
    a[i] = b[i] + c[i];
    /* first update of absum, guarded by gate1 */
#pragma _CNX critical_section(gate1)
    absum = absum + a[i];
#pragma _CNX end_critical_section
    if (adjb[i]) {
        b[i] = c[i] + d[i];
        /* second update of absum, guarded by the same gate */
#pragma _CNX critical_section(gate1)
        absum = absum + b[i];
#pragma _CNX end_critical_section
    }
}
lock = free_gate(&gate1);

The shared variable absum must be updated after a[i] is assigned and again if b[i] is assigned. Both updates must be guarded by the same gate so that two threads never attempt to update absum at once. The critical sections protecting the assignments to absum must therefore name this gate explicitly; otherwise the compiler chooses a unique gate for each section, potentially resulting in incorrect answers.

There must be a substantial amount of parallelizable code outside of these critical sections to make parallelizing this loop cost-effective.
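For comparison, the same pattern can be sketched with OpenMP instead of the _CNX pragmas this guide describes. This is an illustration under stated assumptions, not the guide's own method: the function name sum_with_one_gate and its parameter list are hypothetical. Both updates name the same critical section, gate1, so they exclude each other. OpenMP differs from the behavior described above in one respect: unnamed OpenMP critical sections all share a single lock, so leaving the name off does not split the updates across separate gates.

```c
#include <assert.h>

/* Hypothetical helper (not from the guide): OpenMP analogue of the loop
 * above. Returns the accumulated sum; a[] and b[] are updated in place. */
double sum_with_one_gate(double *a, double *b, const double *c,
                         const double *d, const int *adjb, int n)
{
    double absum = 0.0;   /* shared accumulator */
    int i;

    assert(n >= 0);

    #pragma omp parallel for shared(absum)
    for (i = 0; i < n; i++) {
        a[i] = b[i] + c[i];

        /* first update: only one thread at a time may hold gate1 */
        #pragma omp critical(gate1)
        absum = absum + a[i];

        if (adjb[i]) {
            b[i] = c[i] + d[i];

            /* second update: same name, so it excludes the first one too */
            #pragma omp critical(gate1)
            absum = absum + b[i];
        }
    }
    return absum;
}
```

If the compiler is run without OpenMP support, the pragmas are ignored and the loop runs serially. With OpenMP enabled, the order in which threads add to absum is not fixed, so floating-point rounding can vary slightly between runs.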

Using ordered sections

Like critical sections, ordered sections lock and unlock a specified gate to isolate a section of code in a loop. However, they also ensure that the enclosed section of code executes in the same order as the iterations of the ordered parallel loop that contains it.

Once a given thread passes through an ordered section, it cannot enter again until all other threads have passed through in order. This ordering is difficult to implement without using the ordered section directives or pragmas.
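The ordering guarantee can be sketched with OpenMP's ordered construct, used here as a stand-in for the _CNX ordered section pragmas. The function name ordered_recurrence, its parameters, and the schedule(static, 1) clause are assumptions made for illustration. Each iteration does some independent work, then enters the ordered region, which runs strictly in iteration order, so a[i - 1] is already final when a[i] is computed.

```c
#include <assert.h>

/* Hypothetical helper (not from the guide): a[i] depends on a[i-1], so
 * the assignment must run in iteration order. a[0] is the starting value. */
void ordered_recurrence(double *a, const double *b, const double *c, int n)
{
    int i;

    assert(n >= 0);

    /* the ordered clause is required on any loop that contains an
     * ordered region; chunk size 1 is an assumed choice that hands
     * consecutive iterations to different threads */
    #pragma omp parallel for ordered schedule(static, 1)
    for (i = 1; i < n; i++) {
        double t = b[i] * c[i];   /* independent work, can overlap across threads */

        /* executed one iteration at a time, in loop order */
        #pragma omp ordered
        a[i] = a[i - 1] + t;
    }
}
```

Only the work outside the ordered region overlaps between threads, so a loop like this pays off only when that independent part dominates the running time.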

You must use a loop_parallel(ordered) directive or pragma to parallelize any loop containing an ordered section. See "loop_parallel(ordered)" on page 144 for a description of this directive.

Example 8-6 Ordered sections

The following Fortran example contains a backward loop-carried dependence on the array A that would normally inhibit parallelization.