Parallel Programming Guide for HP-UX Systems

Troubleshooting

False cache line sharing

Chapter 9

ENDDO

C$DIR END_TASKS

Working with unaligned arrays

The most common cache-thrashing complication using arrays and loops

occurs when arrays assigned within a loop are unaligned with each other.

There are several possible causes for this:

• Arrays that are local to a routine are allocated on the stack.

• Array dummy arguments might be passed an element other than the

first in the actual argument.

• Array elements might be assigned with different offset indexes.

Consider the following Fortran code:

COMMON /OKAY/ X(112,100)

...

CALL UNALIGNED (X(I,J))

...

SUBROUTINE UNALIGNED (Y)

REAL*4 Y(*)

! Y(1) PROBABLY NOT ON A CACHE LINE BOUNDARY

The address of Y(1) is unknown. However, if elements of Y are heavily

assigned in this routine, it may be worthwhile to compute an alignment,

given by the following formula:

LREM = LSIZE - ( ( MOD ( LOC(Y(1))-4, LSIZE*x) + 4) /x)

where

LSIZE is the appropriate cache line size in words

x is the data size, in bytes, for elements of Y

For this case, x is 4 (REAL*4) and the cache line size on V2250 servers is 32 bytes, so LSIZE is 8 single-precision words. Note that:

( ( MOD ( LOC(Y(1))-4, LSIZE*4) + 4) /4)

returns a value in the set 1, 2, 3, ..., LSIZE, so LREM is in the range 0 to 7.
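As a quick check of this range, the following sketch evaluates the formula for each possible word offset of Y(1) within one cache line. It is an illustration only: the program name LREMDEMO and the variable ADDR are hypothetical, ADDR stands in for LOC(Y(1)) so that no real address is needed, and the values LSIZE = 8 and x = 4 are the single-precision V2250 case described above.

```fortran
      PROGRAM LREMDEMO
! Hypothetical check of the LREM formula for REAL*4 data (x = 4)
! with an assumed 32-byte cache line (LSIZE = 8 words).
      INTEGER LSIZE, X, ADDR, LREM
      PARAMETER (LSIZE = 8, X = 4)
! ADDR stands in for LOC(Y(1)). Step through every 4-byte-aligned
! position in one cache line, starting at an aligned address.
      DO ADDR = 1024, 1024 + LSIZE*X - X, X
         LREM = LSIZE - ((MOD(ADDR - 4, LSIZE*X) + 4) / X)
         PRINT *, 'BYTE OFFSET', MOD(ADDR, LSIZE*X), ' LREM', LREM
      ENDDO
      END
```

When the stand-in address falls exactly on a cache line boundary (byte offset 0), LREM is 0. At byte offset 4, LREM is 7, meaning seven elements of Y remain before the next cache line boundary. The remaining offsets give the values in between.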

Then a loop such as: