Parallel Programming Guide for HP-UX Systems
MPI
Tuning
Coding considerations
Consider the following suggestions when coding your MPI applications to
improve performance:
• Use HP MPI collective routines instead of coding your own with point-to-point routines.
HP MPI’s collective routines are optimized to use shared memory where possible,
which improves performance.
• Use commutative MPI reduction operations.
— Use the MPI predefined reduction operations whenever possible because they are
optimized.
— When defining your own reduction operations, make them commutative.
Commutative operations give MPI more options when ordering operations,
allowing it to select the order that performs best.
• Use MPI derived datatypes when you exchange several small messages that have no
dependencies.
• Minimize your use of MPI_Test() polling schemes to reduce polling overhead.
• Code your applications to avoid unnecessary synchronization. In particular, strive to avoid
MPI_Barrier calls. Typically, an application can be modified to achieve the same
result using targeted synchronization instead of collective calls. For example, in many
cases a token-passing ring can achieve the same coordination as a loop of
barrier calls.