Parallel Programming Guide for HP-UX Systems

Parallel synchronization
Synchronizing code
      COMMON /DONE/ DONEIN, DONECOMP
      COMPDONE = DONECOMP
      END
Notice that the gates are accessed through COMMON blocks. Each thread
that calls this subroutine allocates a thread_private WORK array.
This subroutine contains a loop that tests INDONE().
The loop copies the input queue into the local WORK array, then does a
significant amount of computational work that has been omitted for
simplicity.
NOTE The computational work is the main code that executes in
parallel. If there is not a large amount of it, the overhead
of setting up these parallel tasks and critical sections
cannot be justified.
The loop encompasses this computation, and also the section of code
that copies the WORK array to the output queue.
This construct allows final output to be written after all input has
been used in computation.
To avoid accessing the input queue while it is being filled or accessed
by another thread, the section of code that copies it into the local
WORK array is protected by a critical section.
NOTE This section must be unconditionally locked because the
computational threads cannot do anything else until they
receive their input.
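The difference between an unconditional lock and a conditional one can be sketched as follows. This is an illustration only, not code from this guide: it assumes a compiler with OpenMP support (for example, gfortran with -fopenmp) and uses the OpenMP lock routines OMP_SET_LOCK and OMP_TEST_LOCK as stand-ins for gate operations. The names LOCKDEMO, GATE, NTAKEN, and NBUSY are hypothetical.

```fortran
      PROGRAM LOCKDEMO
! Sketch only: an OpenMP lock stands in for a gate (assumption).
      USE OMP_LIB
      IMPLICIT NONE
      INTEGER (KIND=OMP_LOCK_KIND) GATE
      INTEGER NTAKEN, NBUSY

      CALL OMP_INIT_LOCK(GATE)
      NTAKEN = 0
      NBUSY = 0

!$OMP PARALLEL
! Unconditional lock: the thread waits here until the lock is
! free, because it has nothing useful to do without its input.
      CALL OMP_SET_LOCK(GATE)
      NTAKEN = NTAKEN + 1
      CALL OMP_UNSET_LOCK(GATE)

! Conditional lock, for contrast: the call returns at once, so a
! thread that finds the lock busy could go on to other work.
      IF (OMP_TEST_LOCK(GATE)) THEN
         NTAKEN = NTAKEN + 1
         CALL OMP_UNSET_LOCK(GATE)
      ELSE
!$OMP ATOMIC
         NBUSY = NBUSY + 1
      END IF
!$OMP END PARALLEL

      CALL OMP_DESTROY_LOCK(GATE)
      PRINT *, 'LOCKED:', NTAKEN, ' FOUND BUSY:', NBUSY
      END
```

A thread blocked in OMP_SET_LOCK does no work until the lock is released. That cost is acceptable for the input copy only because a computational thread has nothing else to do until it holds its input.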
Once the input queue has been copied, THREAD_WRK can perform its large
section of computational code in parallel with whatever the other tasks
are doing. After the computational section is finished, another
unconditional critical section must be entered before the results are
written to the output queue. This prevents two threads from accessing
the output queue at once.
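The overall shape of such a worker loop can be sketched as follows. This is a minimal illustration under stated assumptions, not the THREAD_WRK routine itself: OpenMP named critical sections stand in for the gate-protected critical sections, the PRIVATE clause stands in for the thread_private WORK array, and the input queue is filled in advance rather than by a separate input task. The names WRKDEMO, INQ, OUTQ, NEXTIN, NOUT, MINE, INGATE, and OUTGATE are hypothetical, and the doubling loop is a placeholder for the omitted computational work.

```fortran
      PROGRAM WRKDEMO
! Sketch only: OpenMP critical sections stand in for gates.
      IMPLICIT NONE
      INTEGER NITEM, NLEN
      PARAMETER (NITEM = 8, NLEN = 4)
      REAL INQ(NLEN, NITEM), OUTQ(NITEM), WORK(NLEN)
      INTEGER NEXTIN, NOUT, MINE, I, J
      LOGICAL MORE

! Fill the input queue up front with placeholder data.
      DO J = 1, NITEM
         DO I = 1, NLEN
            INQ(I, J) = REAL(J)
         END DO
      END DO
      NEXTIN = 1
      NOUT = 0

! Each thread gets its own WORK array and bookkeeping variables.
!$OMP PARALLEL PRIVATE(WORK, MINE, I, MORE)
      MORE = .TRUE.
      DO WHILE (MORE)
! First critical section: claim one entry and copy it to WORK.
!$OMP CRITICAL (INGATE)
         MINE = NEXTIN
         IF (MINE .LE. NITEM) THEN
            NEXTIN = NEXTIN + 1
            DO I = 1, NLEN
               WORK(I) = INQ(I, MINE)
            END DO
         END IF
!$OMP END CRITICAL (INGATE)
         IF (MINE .GT. NITEM) THEN
! No input left: this thread leaves the loop.
            MORE = .FALSE.
         ELSE
! Computation on private data, outside any critical section.
            DO I = 1, NLEN
               WORK(I) = 2.0 * WORK(I)
            END DO
! Second critical section: one thread at a time writes output.
!$OMP CRITICAL (OUTGATE)
            NOUT = NOUT + 1
            OUTQ(NOUT) = SUM(WORK)
!$OMP END CRITICAL (OUTGATE)
         END IF
      END DO
!$OMP END PARALLEL

      PRINT *, 'RESULTS WRITTEN:', NOUT
      END
```

Using two differently named critical sections lets one thread copy input while another writes output. Only the short copy and write steps are serialized, and the computation between them runs in parallel.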