(3M) Calculator User Manual
The free/libre software license under which Scotch 5.1 is distributed is
the CeCILL-C license [6], which has bas ic ally the same features a s the GNU
LGPL (“Lesser General Pu blic License”): ability to link the code as a library
to any free/libre or even pro prietary software, ability to modify the code a nd to
redistribute these modifications. Version 4.0 of Scotch was distributed under the
LGPL itself.
Please refer to se c tion 8 to s e e how to obtain the free/libre distribution of
Scotch.
3 Algorithms
3.1 Static mapping by Dual Recursive Bipartitioning
For a detailed description of the mapping algo rithm and an extensive analysis of its
performance, please refer to [41, 44]. In the next se c tions, we will only outline the
most important aspects of the algorithm.
3.1.1 Static mapping
The parallel program to be mapped onto the targe t a rchitecture is modeled by a val-
uated unoriented graph S called source graph or process graph, the vertices of which
represent the processes of the parallel program, and the edges of which the commu-
nication channels between communicating processes. Vertex- and edge- valuations
associate with every vertex v
S
and eve ry edge e
S
of S integer numbers w
S
(v
S
) and
w
S
(e
S
) which estimate the computation weight of the corresp onding process and
the amount of communication to be transmitted on the channel, respectively.
The target machine onto which is mapped the parallel program is also modeled
by a valuated unoriented graph T called target graph or architecture graph. Vertices
v
T
and edges e
T
of T are assigned integer weights w
T
(v
T
) and w
T
(e
T
), which
estimate the computational power of the corre sponding processor and the cost of
traver sal of the inter-process or link, respectively.
A mapping from S to T consists of two applications τ
S,T
: V (S) −→ V (T ) a nd
ρ
S,T
: E(S) −→ P(E(T )), where P(E(T )) denotes the set of all simple loopless
paths which can be built from E(T ). τ
S,T
(v
S
) = v
T
if process v
S
of S is mapped
onto processor v
T
of T , and ρ
S,T
(e
S
) = { e
1
T
, e
2
T
, . . . , e
n
T
} if communication channel
e
S
of S is routed through communication links e
1
T
, e
2
T
, . . . , e
n
T
of T . |ρ
S,T
(e
S
)|
denotes the dilation of edge e
S
, that is, the number of edges of E(T ) used to route
e
S
.
3.1.2 Cost function and performance criteria
The computation of efficient static mappings requires an a priori knowledge of the
dynamic behavior of the target machine with respect to the programs which are
run on it. This knowledge is synthesized in a cost function, the nature of which
determines the characteristics of the desired optimal mappings. The goa l of our
mapping algorithm is to minimize some communication cost function, while keeping
the load balance within a specified tolerance. The communication cost function f
C
that we have chosen is the sum, for all edges, of their dilation multiplied by their
weight:
f
C
(τ
S,T
, ρ
S,T
)
def
=
X
e
S
∈E(S)
w
S
(e
S
) |ρ
S,T
(e
S
)| .
8