Parallel Programming Guide for HP-UX Systems

Chapter 9: Troubleshooting
Triangular loops
The scheme above maps a sequence of NTCHUNK-sized blocks over the F
array. Within each block, each thread owns a specific cache line of data.
The relationship between data, threads, and blocks of size NTCHUNK is
shown in Figure 9-1 on page 196.
Figure 9-1 Data ownership by CHUNK and NTCHUNK blocks
CHUNK is the number of iterations a thread works on at one time. The
idea is to make a thread work on the same elements of F from one
iteration of I to the next (except for those that are already complete).
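
The idea can be sketched in a few lines of Python. This is a minimal illustration, not the guide's code: the values CHUNK = 8 and eight threads are read off Figure 9-1, the update `f[j] += i` is a placeholder for the real loop body, and the helper names `owned_elements`, `chunked_triangular`, and `serial_triangular` are hypothetical. The threads are simulated one after another rather than run in parallel.

```python
# Sketch only: CHUNK and NTHREADS are assumed from Figure 9-1, and the
# update f[j] += i is a placeholder for the real loop body.
CHUNK = 8                    # elements of F a thread handles at one time
NTHREADS = 8                 # number of threads (assumed)
NTCHUNK = CHUNK * NTHREADS   # elements of F covered by one block


def owned_elements(tid, i, n):
    """Indices j (1-based, i < j <= n) that thread tid updates for outer iteration i."""
    owned = []
    # Visit this thread's chunk inside every NTCHUNK-sized block.
    for start in range(tid * CHUNK + 1, n + 1, NTCHUNK):
        lo = max(start, i + 1)            # skip elements that are already complete
        hi = min(start + CHUNK - 1, n)    # do not run past the end of F
        owned.extend(range(lo, hi + 1))
    return owned


def chunked_triangular(n):
    """Triangular loop in which each (simulated) thread keeps the same chunks of F."""
    f = [0] * (n + 1)                     # index 0 unused, to mirror 1-based F
    for tid in range(NTHREADS):           # each pass stands in for one thread
        for i in range(1, n + 1):
            for j in owned_elements(tid, i, n):
                f[j] += i
    return f


def serial_triangular(n):
    """Reference version: the plain triangular loop over I and J = I+1..N."""
    f = [0] * (n + 1)
    for i in range(1, n + 1):
        for j in range(i + 1, n + 1):
            f[j] += i
    return f
```

Because every element of F belongs to exactly one thread for all values of I, the chunked version produces the same result as the serial loop while each thread keeps returning to the same cache lines.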
              Associated thread    CHUNKs of F
NTCHUNK 1     thread 0             F(1) ... F(8)
              thread 1             F(9) ... F(16)
              thread 2             F(17) ... F(24)
              thread 3             F(25) ... F(32)
              thread 4             F(33) ... F(40)
              thread 5             F(41) ... F(48)
              thread 6             F(49) ... F(56)
              thread 7             F(57) ... F(64)
NTCHUNK 2     thread 0             F(65) ... F(72)
              thread 1             F(73) ... F(80)
              ...                  F(81) ...
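
The ownership pattern in Figure 9-1 follows from simple index arithmetic. The sketch below assumes CHUNK = 8 and eight threads, as the figure suggests, and uses a hypothetical helper `owner` that is not part of the guide. It prints the block and thread for the first element of each chunk.

```python
# Sketch only: CHUNK and NTHREADS are assumed from Figure 9-1.
CHUNK = 8                    # elements of F per chunk
NTHREADS = 8                 # number of threads (assumed)
NTCHUNK = CHUNK * NTHREADS   # elements of F per block


def owner(j):
    """Return (block, thread) for the 1-based element index j of F."""
    block = (j - 1) // NTCHUNK + 1            # blocks are numbered from 1
    thread = ((j - 1) % NTCHUNK) // CHUNK     # threads are numbered from 0
    return block, thread


# List the chunks F(1)..F(8) through F(73)..F(80) with their owners.
for first in range(1, 81, CHUNK):
    block, thread = owner(first)
    print(f"NTCHUNK {block}  thread {thread}  F({first}) ... F({first + CHUNK - 1})")
```

With a different thread count or chunk size, only the two constants change; the mapping itself stays the same.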