Parallel Programming Guide for HP-UX Systems

Chapter 16
Programming Methods
High performance programming methods address one or more processors, which may reside
within an SMP system or be distributed over the nodes of a cluster. These methods include
standard serial optimizations and library calls, auto-parallelization offered by some
compilers, OpenMP directives, calls to the POSIX threads (Pthreads) library, and calls to the
Message Passing Interface (MPI).
A parallel high performance program that takes advantage of multiple processors is still, in
effect, a collection of single-processor programs running concurrently. Standard serial
optimizations aimed at uniprocessor performance therefore remain important, including loop
unrolling, cache blocking, and other coding techniques that help the compilers better optimize
and pipeline programs.
By using multi-threading (OpenMP directives or Pthreads calls) and message passing (calls to
an MPI library), the developer then achieves concurrency: multiple parts of the program run
simultaneously, yielding factors of performance improvement. Compiler auto-parallelization
can generate the multi-threading automatically.
HP provides an MPI library that runs optimally on HP clusters, on both the HP-UX and Linux
platforms for PA-RISC, IA32, and Itanium processors. HP MPI is reliable, thread-safe, and fully
compliant with the MPI-2 standard. HP MPI supports a variety of interconnect fabrics,
including TCP/IP, Quadrics Elan, and InfiniBand, as well as intranode communication. HP
Visual MPI, a companion product of HP MPI, is an analysis tool that provides error detection
and statistical analysis to highlight issues for improving performance.