Parallel Programming Guide for HP-UX Systems

Memory classes
Memory class assignments
Chapter 7
The C/C++ double data type provides the same precision as Fortran’s
REAL*8. The thread_private data declared here occupies the same
amount of memory as that declared in the Fortran example. tpa is
available to all functions lexically following it in the file. tpb is local to
func and inaccessible to other functions. tpc, a, and b are declared at
file scope in another file that is linked with this one.
Assume that a Fortran or C program containing the appropriate example is
running on a 4-hypernode subcomplex with 16 processors per hypernode,
and that the thread_private memory is allocated from node_private
memory (see the mpa(1) man page). Each data item requires 16 virtual
addresses, one for each thread on a hypernode, for a total of 384,256
(16 x 24016) bytes of virtual space. These virtual addresses map to 16
physical addresses per hypernode, or 64 total physical addresses per data
item, requiring a total of 1,537,024 (64 x 24016) bytes of physical
memory.
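The arithmetic above can be checked with a short C sketch. The function names are hypothetical and are not part of any HP library. The 24016-byte per-thread size is taken from the product quoted in the text, and the sketch assumes one copy of the data per thread, with each hypernode backing its 16 copies with its own physical memory.

```c
#include <stddef.h>

/* Hypothetical helper: virtual space needed for thread_private data.
   One copy is addressed per thread on a hypernode. */
size_t tp_virtual_bytes(size_t bytes_per_copy, size_t threads_per_node)
{
    return bytes_per_copy * threads_per_node;
}

/* Hypothetical helper: physical memory needed when the data is
   allocated from node_private memory, so that every hypernode
   backs the same virtual addresses with its own physical pages. */
size_t tp_physical_bytes(size_t bytes_per_copy, size_t threads_per_node,
                         size_t hypernodes)
{
    return tp_virtual_bytes(bytes_per_copy, threads_per_node) * hypernodes;
}
```

Calling tp_virtual_bytes(24016, 16) and tp_physical_bytes(24016, 16, 4) reproduces the two totals given in the text for the 4-hypernode, 16-processor configuration.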
Example 7-3 thread_private COMMON blocks in parallel subroutines
Data local to a procedure that is called in parallel is effectively private
because storage for it is allocated on the thread’s private stack. However,
if the data is in a Fortran COMMON block (or if it appears in a DATA or SAVE
statement), it is not stored on the stack. Parallel accesses to such
nonprivate data must be synchronized if the data is assigned a shared class.
Alternatively, if the parallel copies of the procedure do not need to share
the data, it can be assigned a private class.
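The same idea can be sketched in portable C. This is an analogy only: it uses the C11 _Thread_local storage class and POSIX threads rather than the HP thread_private directive, and the names scratch, worker, and run_private_demo are hypothetical. Static data would normally be shared by every thread, but marking it thread-local gives each thread its own copy, so no synchronization is needed.

```c
#include <pthread.h>

#define NTHREADS 4
#define N 1000

/* Static work array, comparable to a COMMON block. _Thread_local
   gives every thread its own private copy of it. */
static _Thread_local int scratch[N];

struct job { int id; int intact; };

/* Each worker fills its private copy with its own id, then checks
   that no other thread has overwritten any element. */
static void *worker(void *arg)
{
    struct job *j = arg;
    for (int i = 0; i < N; i++)
        scratch[i] = j->id;
    j->intact = 1;
    for (int i = 0; i < N; i++)
        if (scratch[i] != j->id)
            j->intact = 0;
    return NULL;
}

/* Runs the workers and returns how many found their copy intact. */
int run_private_demo(void)
{
    pthread_t tid[NTHREADS];
    struct job jobs[NTHREADS];
    int started = 0, intact = 0;

    for (int t = 0; t < NTHREADS; t++) {
        jobs[t].id = t + 1;
        jobs[t].intact = 0;
        if (pthread_create(&tid[t], NULL, worker, &jobs[t]) != 0)
            break;                  /* stop if a thread cannot start */
        started++;
    }
    for (int t = 0; t < started; t++) {
        pthread_join(tid[t], NULL);
        intact += jobs[t].intact;
    }
    return intact;
}
```

If scratch were an ordinary static array, the workers would race on it and would need a lock, which corresponds to assigning the data a shared class.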
Consider the following Fortran example:
      INTEGER A(1000,1000)
      .
      .
      .
C$DIR LOOP_PARALLEL(THREADS)
      DO I = 1, N
        CALL PARCOM(A(1,I))
        .
        .
        .
      ENDDO

      SUBROUTINE PARCOM(A)
      INTEGER A(*)
      INTEGER C(1000), D(1000)
      COMMON /BLK1/ C, D
C$DIR THREAD_PRIVATE(/BLK1/)
      INTEGER TEMP1, TEMP2
      D(1:1000) = ...
      .
      .
      .
      CALL PARCOM2(A, JTA)