User guide
6–SHMEM Description and Configuration
Progress Model
6-12 IB0054606-02 A
Active Progress
In active progress mode, SHMEM makes progress only when the application 
calls into the SHMEM library. This approach is well matched to applications that 
call into SHMEM frequently, for example, those with a fine-grained mix of SHMEM 
operations and computation. This mix is typical of many SHMEM applications. 
Applications that spend long uninterrupted periods in computation without 
calling SHMEM routines delay SHMEM progress for that period. Additionally, 
applications must not poll on memory locations waiting for puts to arrive 
without calling SHMEM: no progress occurs, and the program hangs. Instead, 
SHMEM applications should use one of the wait synchronization primitives 
provided by SHMEM. QLogic SHMEM achieves full performance in active progress 
mode. 
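The polling hazard can be illustrated with a small self-contained model. This is a sketch under stated assumptions, not the SHMEM API: the names toy_put, toy_progress, toy_raw_poll, and toy_wait_until are hypothetical, and the pending-put queue is an assumed stand-in for work that the library completes only when the application calls into it. In a real program, the role of toy_wait_until would be played by one of the SHMEM wait synchronization primitives.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of active progress (hypothetical; not the SHMEM API).
 * A "put" is queued and becomes visible in target memory only
 * when toy_progress() runs, i.e. when the application calls in. */

#define TOY_MAX_PENDING 16

struct toy_put_op {
    int *target;  /* destination location */
    int value;    /* value to deliver */
};

static struct toy_put_op toy_pending[TOY_MAX_PENDING];
static size_t toy_npending = 0;

/* Queue an incoming put; returns 0 on success, -1 if the queue is full. */
int toy_put(int *target, int value)
{
    assert(target != NULL);
    if (toy_npending == TOY_MAX_PENDING)
        return -1;
    toy_pending[toy_npending].target = target;
    toy_pending[toy_npending].value = value;
    toy_npending++;
    return 0;
}

/* Deliver every queued put; returns the number delivered. */
size_t toy_progress(void)
{
    size_t delivered = toy_npending;
    for (size_t i = 0; i < toy_npending; i++)
        *toy_pending[i].target = toy_pending[i].value;
    toy_npending = 0;
    return delivered;
}

/* Raw polling: reads the location but never calls toy_progress().
 * Returns 1 if the value was seen within max_spins reads, else 0. */
int toy_raw_poll(volatile int *flag, int expected, long max_spins)
{
    for (long i = 0; i < max_spins; i++)
        if (*flag == expected)
            return 1;
    return 0;
}

/* Wait primitive: drives progress on every iteration.
 * Returns 1 if the value arrived within max_spins iterations, else 0. */
int toy_wait_until(volatile int *flag, int expected, long max_spins)
{
    for (long i = 0; i < max_spins; i++) {
        toy_progress();
        if (*flag == expected)
            return 1;
    }
    return 0;
}
```

In this model, a caller that queues a put with toy_put and then uses toy_raw_poll never observes the value, however long it spins, because nothing drives the queue. The same caller using toy_wait_until sees the value on the first iteration. The spin limit exists only so the model terminates; a real polling loop in active progress mode would spin forever.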
Passive Progress
In passive progress mode, SHMEM progress still occurs when the application 
calls into SHMEM, but can additionally occur in the background while the 
application is not calling into SHMEM. This is achieved using one additional 
progress thread per PE. The progress thread is provided by PSM and is 
scheduled at a relatively low frequency, typically 10 to 100 times a second. This 
thread drives independent SHMEM progress where required, on both the 
initiator side and the target side of SHMEM operations. In this mode, applications 
can poll on locations waiting for puts to arrive without calling SHMEM. Progress 
is then made by the progress thread, but each such wait incurs the thread's 
scheduling latency, which may have a significant impact on overall performance 
if this idiom is used frequently. The scheduling frequency of the PSM progress 
thread can be tuned as described in the Environment Variables section.
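The cost of polling in passive progress mode can be estimated with simple arithmetic. The sketch below is a back-of-envelope model, not a measurement: it assumes the progress thread wakes at a fixed rate and that, in the worst case, each put arrives just after the thread last ran. The function names progress_period_us and worst_case_poll_delay_us are hypothetical.

```c
#include <assert.h>

/* Back-of-envelope model (an assumption, not a measured result):
 * the progress thread wakes at a fixed rate of hz times per second. */

/* Interval between progress-thread wakeups, in microseconds. */
long progress_period_us(long hz)
{
    assert(hz > 0);
    return 1000000L / hz;
}

/* Worst-case added delay, in microseconds, for `polls` polled waits,
 * assuming each put arrives just after the thread last ran. */
long worst_case_poll_delay_us(long hz, long polls)
{
    assert(polls >= 0);
    return progress_period_us(hz) * polls;
}
```

Under these assumptions, a polled wait can stall for up to 100 ms at 10 wakeups per second and up to 10 ms at 100 wakeups per second. A loop that performs 1,000 such waits at 100 wakeups per second could accumulate up to 10 seconds of scheduling delay, which is why the idiom is best kept off frequently executed paths.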
Other performance effects of using passive progress include the following:
• The progress thread consumes some CPU cycles, though the overhead is low 
  because the progress thread runs infrequently.
• The SHMEM library uses additional locks in its implementation to protect its 
  data structures against concurrent updates from the PE thread and the 
  progress thread. There is a slight additional cost in the performance-critical 
  path because of this locking. This cost is minimal because contention on the 
  lock is very low (the progress thread runs infrequently) and because each 
  progress thread runs on the same CPU core as the corresponding PE 
  thread (giving good cache locality for the lock).










