(3M) Calculator User Manual
This function, which has already been considered by several authors for hyper-
cube target topologies [11, 21, 25], has several interesting properties: it is easy
to compute, allows incremental updates performed by iterative algorithms, and its
minimization favors the mapping of intensively intercommunicating processes onto
nearby process ors; regardless of the type of routage implemented on the target
machine (store- and-forward or cut-through), it models the traffic on the intercon-
nection network and thus the risk o f co ngestion.
The strong p ositive correlation between values of this function and effective
execution times has been experimentally verified by Hammond [21] on the CM-2,
and by Hendrickson and Leland [26] on the nCUBE 2.
The quality of mappings is evaluated with respect to the criteria for quality that
we have chosen: the balance of the computation load across processors, and the
minimization of the interprocessor c ommunication cost modeled by function f
C
.
These criteria lead to the definition of several parameters, which are desc rib e d
below.
For load balance, one can define µ
map
, the average load per computational
power unit (which does not depend on the mapping), and δ
map
, the load imbalance
ratio, as
µ
map
def
=
P
v
S
∈V (S)
w
S
(v
S
)
P
v
T
∈V (T )
w
T
(v
T
)
and
δ
map
def
=
P
v
T
∈V (T )
1
w
T
(v
T
)
P
v
S
∈ V (S)
τ
S,T
(v
S
) = v
T
w
S
(v
S
)
− µ
map
P
v
S
∈V (S)
w
S
(v
S
)
.
However, since the maximum load imbalance ratio is provided by the user in input
of the mapping, the information given by these parameters is of little interest, since
what matters is the minimization of the c ommunication c ost function under this
load balance cons traint.
For communication, the stra ightforward parameter to consider is f
C
. It can be
normalized as µ
exp
, the average edge expansion, which can be compared to µ
dil
,
the average edge dilation; these are defined as
µ
exp
def
=
f
C
P
e
S
∈E(S)
w
S
(e
S
)
and µ
dil
def
=
P
e
S
∈E(S)
|ρ
S,T
(e
S
)|
|E(S)|
.
δ
exp
def
=
µ
exp
µ
dil
is smaller than 1 when the mapper succeeds in putting heavily inter-
communicating processes closer to each other than it does for lightly communicating
processes; they ar e e qual if all edges have same weight.
3.1.3 The Dual Recursive Bipartitioning algorithm
Our mapping algorithm uses a divide and conquer approach to recursively allocate
subsets of processes to subsets of processors [41]. It starts by considering a set of
processors, also called domain, containing all the process ors of the target machine,
and with which is asso c iated the set of all the process e s to map. At each step, the
algorithm bipartitions a yet unproce ssed domain into two disjoint subdomains, and
9