(3M) Calculator User Manual

The free/libre software license under which Scotch 5.1 is distributed is the CeCILL-C license [6], which has basically the same features as the GNU LGPL (“Lesser General Public License”): ability to link the code as a library to any free/libre or even proprietary software, ability to modify the code and to redistribute these modifications. Version 4.0 of Scotch was distributed under the LGPL itself.

Please refer to section 8 to see how to obtain the free/libre distribution of Scotch.

3 Algorithms

3.1 Static mapping by Dual Recursive Bipartitioning

For a detailed description of the mapping algorithm and an extensive analysis of its performance, please refer to [41, 44]. In the next sections, we will only outline the most important aspects of the algorithm.

3.1.1 Static mapping

The parallel program to be mapped onto the target architecture is modeled by a valuated unoriented graph $S$ called source graph or process graph, the vertices of which represent the processes of the parallel program, and the edges of which the communication channels between communicating processes. Vertex- and edge-valuations associate with every vertex $v_S$ and every edge $e_S$ of $S$ integer numbers $w_S(v_S)$ and $w_S(e_S)$ which estimate the computation weight of the corresponding process and the amount of communication to be transmitted on the channel, respectively.

The target machine onto which the parallel program is mapped is also modeled by a valuated unoriented graph $T$ called target graph or architecture graph. Vertices $v_T$ and edges $e_T$ of $T$ are assigned integer weights $w_T(v_T)$ and $w_T(e_T)$, which estimate the computational power of the corresponding processor and the cost of traversal of the inter-processor link, respectively.
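To make the two graph models concrete, here is a minimal Python sketch of a valuated unoriented graph with integer vertex and edge weights. The class name `ValuatedGraph`, its fields, and the example weights are hypothetical and chosen for illustration only; they are not part of the Scotch library, which uses its own C data structures.

```python
from dataclasses import dataclass, field


@dataclass
class ValuatedGraph:
    """Unoriented graph with integer vertex and edge weights (illustrative only)."""
    vertex_weight: dict = field(default_factory=dict)  # vertex -> integer weight
    edge_weight: dict = field(default_factory=dict)    # frozenset({u, v}) -> integer weight

    def add_vertex(self, v, weight=1):
        self.vertex_weight[v] = weight

    def add_edge(self, u, v, weight=1):
        if u not in self.vertex_weight or v not in self.vertex_weight:
            raise KeyError("both end vertices must be added first")
        # A frozenset key makes the edge unoriented: {u, v} == {v, u}.
        self.edge_weight[frozenset((u, v))] = weight


# Source graph S: four processes communicating in a ring.
S = ValuatedGraph()
for process, load in [("p0", 3), ("p1", 1), ("p2", 2), ("p3", 2)]:
    S.add_vertex(process, load)          # computation weight of the process
for u, v, volume in [("p0", "p1", 5), ("p1", "p2", 1),
                     ("p2", "p3", 4), ("p3", "p0", 2)]:
    S.add_edge(u, v, volume)             # amount of communication on the channel

# Target graph T: two processors joined by a single link.
T = ValuatedGraph()
T.add_vertex("cpu0", 1)                  # computational power of the processor
T.add_vertex("cpu1", 1)
T.add_edge("cpu0", "cpu1", 1)            # cost of traversing the link

total_load = sum(S.vertex_weight.values())
```

Storing edges under `frozenset` keys is one simple way to keep the graph unoriented; an adjacency-array layout would be more compact for large graphs.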

A mapping from $S$ to $T$ consists of two applications $\tau_{S,T} : V(S) \longrightarrow V(T)$ and $\rho_{S,T} : E(S) \longrightarrow \mathcal{P}(E(T))$, where $\mathcal{P}(E(T))$ denotes the set of all simple loopless paths which can be built from $E(T)$. $\tau_{S,T}(v_S) = v_T$ if process $v_S$ of $S$ is mapped onto processor $v_T$ of $T$, and $\rho_{S,T}(e_S) = \{ e_T^1, e_T^2, \ldots, e_T^n \}$ if communication channel $e_S$ of $S$ is routed through communication links $e_T^1, e_T^2, \ldots, e_T^n$ of $T$. $|\rho_{S,T}(e_S)|$ denotes the dilation of edge $e_S$, that is, the number of edges of $E(T)$ used to route $e_S$.

3.1.2 Cost function and performance criteria

The computation of efficient static mappings requires an a priori knowledge of the dynamic behavior of the target machine with respect to the programs which are run on it. This knowledge is synthesized in a cost function, the nature of which determines the characteristics of the desired optimal mappings. The goal of our mapping algorithm is to minimize some communication cost function, while keeping the load balance within a specified tolerance. The communication cost function $f_C$ that we have chosen is the sum, for all edges, of their dilation multiplied by their weight:

\[
f_C(\tau_{S,T}, \rho_{S,T}) \stackrel{\mathrm{def}}{=} \sum_{e_S \in E(S)} w_S(e_S) \, |\rho_{S,T}(e_S)| .
\]
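As a worked illustration, the sketch below evaluates this communication cost (the sum over source edges of edge weight times dilation) for two candidate mappings of the same three-process program. The function name `communication_cost` and all weights and dilations are made up for the example. The load-balance constraint is deliberately ignored here.

```python
def communication_cost(edge_weight, dilation):
    """Sum, over all source edges, of edge weight multiplied by dilation."""
    return sum(weight * dilation[edge] for edge, weight in edge_weight.items())


# Communication weights of the source edges (hypothetical values).
edge_weight = {("a", "b"): 5, ("b", "c"): 2, ("a", "c"): 3}

# Candidate 1: a and b share a processor; c is two links away.
dilation_1 = {("a", "b"): 0, ("b", "c"): 2, ("a", "c"): 2}

# Candidate 2: a and c share a processor; b is one link away.
dilation_2 = {("a", "b"): 1, ("b", "c"): 1, ("a", "c"): 0}

cost_1 = communication_cost(edge_weight, dilation_1)  # 5*0 + 2*2 + 3*2 = 10
cost_2 = communication_cost(edge_weight, dilation_2)  # 5*1 + 2*1 + 3*0 = 7
```

Under this cost function alone the second candidate is preferable, although a real mapper would also have to check that both candidates keep the load balance within the specified tolerance.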