Running
Run-time problems originate from many sources and may include:
Shared memory
Message buffering
Propagation of environment variables
Interoperability
Fortran 90 programming features
UNIX open file descriptors
External input and output
Shared memory When an MPI application starts, each MPI process attempts to allocate a
section of shared memory. This allocation can fail if the system-imposed limit on the
maximum number of allowed shared-memory identifiers is exceeded or if the amount of
available physical memory is not sufficient to fill the request.
After shared-memory allocation is done, every MPI process attempts to attach to the
shared-memory region of every other process residing on the same host. This attachment can
fail if the number of shared-memory segments attached to the calling process exceeds the
system-imposed limit. In this case, use the MPI_GLOBMEMSIZE environment variable to reset
your shared-memory allocation.
Furthermore, all processes must be able to attach to a shared-memory region at the same
virtual address. For example, if the first process to attach to the segment attaches at address
ADR, then the virtual-memory region starting at ADR must be available to all other
processes. Making MPI_Init the first executable statement in your program helps avoid this
problem, because the virtual address space has not yet been fragmented by other allocations,
so a common attach address is more likely to be available in every process. A process with a
large stack size is also prone to this failure, so choose the process stack size carefully.
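
A minimal sketch of this recommendation follows. The payload size and variable names are
hypothetical; the only point illustrated is that MPI_Init runs before any sizable allocation
or other work:

#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>

int main(int argc, char *argv[])
{
    int rank;
    double *work;

    /* Call MPI_Init before anything else, so the shared-memory
       segment can be attached while the virtual address space is
       still largely unused in every process. */
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Hypothetical working storage, allocated only after MPI_Init. */
    work = (double *) malloc(1000000 * sizeof(double));
    if (work == NULL) {
        fprintf(stderr, "rank %d: malloc failed\n", rank);
        MPI_Abort(MPI_COMM_WORLD, 1);
    }

    /* ... computation and communication ... */

    free(work);
    MPI_Finalize();
    return 0;
}
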
Message buffering According to the MPI standard, message buffering may or may not
occur when processes communicate with each other using MPI_Send. MPI_Send buffering is at
the discretion of the MPI implementation. Therefore, you should take care when coding
communications that depend upon buffering to work correctly.
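
As a rough illustration of such buffering-dependent code, the following sketch (the message
length, tag, and two-process layout are assumptions made for the example) has each of two
processes call MPI_Send before its matching MPI_Recv, so its completion depends entirely on
whether the implementation buffers the messages:

#include <mpi.h>

#define COUNT 100000   /* hypothetical message length */
#define TAG   0        /* hypothetical message tag    */

int main(int argc, char *argv[])
{
    static double sbuf[COUNT], rbuf[COUNT];
    int rank, peer;
    MPI_Status status;

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    peer = 1 - rank;   /* assumes exactly two processes */

    /* Each process sends first, then receives.  If MPI_Send does not
       buffer the message, both processes block in MPI_Send and the
       program hangs.  Pairing the calls with MPI_Sendrecv instead
       removes the dependence on buffering. */
    MPI_Send(sbuf, COUNT, MPI_DOUBLE, peer, TAG, MPI_COMM_WORLD);
    MPI_Recv(rbuf, COUNT, MPI_DOUBLE, peer, TAG, MPI_COMM_WORLD, &status);

    MPI_Finalize();
    return 0;
}
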
For example, when two processes use MPI_Send to simultaneously send a message to each
other and use MPI_Recv to receive the messages, the results are unpredictable. If the
messages are buffered, communication works correctly. If the messages are not buffered,
however, each process hangs in MPI_Send waiting for MPI_Recv to take the message. For