Scotch and libScotch 5.1 User’s Guide (version 5.1.10) François Pellegrini Bacchus team, INRIA Bordeaux Sud-Ouest ENSEIRB & LaBRI, UMR CNRS 5800 Université Bordeaux I 351 cours de la Libération, 33405 TALENCE, FRANCE pelegrin@labri.fr August 29, 2010 Abstract This document describes the capabilities and operations of Scotch and libScotch, a software package and a software library devoted to static mapping, partitioning, and sparse matrix block ordering of graphs and meshes/hypergraphs.
Contents 1 Introduction 1.1 Static mapping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1.2 Sparse matrix ordering . . . . . . . . . . . . . . . . . . . . . . . . . . 1.3 Contents of this document . . . . . . . . . . . . . . . . . . . . . . . . 6 6 6 7 2 The Scotch project 2.1 Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2.2 Availability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 7 7 3 Algorithms 3.
6.3.5 6.3.6 6.3.7 6.3.8 6.3.9 6.3.10 6.3.11 6.3.12 6.3.13 6.3.14 6.3.15 6.3.16 6.3.17 gcv . . . . . gmap / gpart gmk * . . . . gmk msh . . . gmtst . . . . gord . . . . . gotst . . . . gout . . . . . gtst . . . . . mcv . . . . . mmk * . . . . mord . . . . . mtst . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .
7.5.3 SCOTCH graphFree . . . . . . . . . 7.5.4 SCOTCH graphLoad . . . . . . . . . 7.5.5 SCOTCH graphSave . . . . . . . . . 7.5.6 SCOTCH graphBuild . . . . . . . . 7.5.7 SCOTCH graphBase . . . . . . . . . 7.5.8 SCOTCH graphCheck . . . . . . . . 7.5.9 SCOTCH graphSize . . . . . . . . . 7.5.10 SCOTCH graphData . . . . . . . . . 7.5.11 SCOTCH graphStat . . . . . . . . . 7.6 Graph mapping and partitioning routines 7.6.1 SCOTCH graphPart . . . . . . . . . 7.6.2 SCOTCH graphMap . . . . . . . . . . 7.6.
7.11 7.12 7.13 7.14 7.10.2 SCOTCH stratExit . . . . . . . . 7.10.3 SCOTCH stratSave . . . . . . . . 7.10.4 SCOTCH stratGraphBipart . . . 7.10.5 SCOTCH stratGraphMap . . . . . 7.10.6 SCOTCH stratGraphMapBuild . . 7.10.7 SCOTCH stratGraphOrder . . . . 7.10.8 SCOTCH stratGraphOrderBuild 7.10.9 SCOTCH stratMeshOrder . . . . . 7.10.10 SCOTCH stratMeshOrderBuild . Geometry handling routines . . . . . . . 7.11.1 SCOTCH geomInit . . . . . . . . . 7.11.2 SCOTCH geomExit . . . . . . . . . 7.11.3 SCOTCH geomData .
1 1.1 Introduction Static mapping The efficient execution of a parallel program on a parallel machine requires that the communicating processes of the program be assigned to the processors of the machine so as to minimize its overall running time. When processes have a limited duration and their logical dependencies are accounted for, this optimization problem is referred to as scheduling.
1.3 Contents of this document This document describes the capabilities and operations of Scotch, a software package devoted to static mapping, graph and mesh partitioning, and sparse matrix block ordering. Scotch allows the user to map efficiently any kind of weighted process graph onto any kind of weighted architecture graph, and provides highquality block orderings of sparse matrices. The rest of this manual is organized as follows.
The free/libre software license under which Scotch 5.1 is distributed is the CeCILL-C license [6], which has basically the same features as the GNU LGPL (“Lesser General Public License”): ability to link the code as a library to any free/libre or even proprietary software, ability to modify the code and to redistribute these modifications. Version 4.0 of Scotch was distributed under the LGPL itself. Please refer to section 8 to see how to obtain the free/libre distribution of Scotch. 3 3.
This function, which has already been considered by several authors for hypercube target topologies [11, 21, 25], has several interesting properties: it is easy to compute, allows incremental updates performed by iterative algorithms, and its minimization favors the mapping of intensively intercommunicating processes onto nearby processors; regardless of the type of routage implemented on the target machine (store-and-forward or cut-through), it models the traffic on the interconnection network and thus the
calls a graph bipartitioning algorithm to split the subset of processes associated with the domain across the two subdomains, as sketched in the following. mapping (D, P) Set_Of_Processors D; Set_Of_Processes P; { Set_Of_Processors D0, D1; Set_Of_Processes P0, P1; if (|P| == 0) return; if (|D| == 1) { result (D, P); return; } /* If nothing to do. */ /* If one processor in D */ /* P is mapped onto it.
3.1.4 Partial cost function The production of efficient complete mappings requires that all graph bipartitionings favor the criteria that we have chosen.
neighbor processes, whether they are handled by the job itself or not, since it can estimate in fC′ the dilation of the corresponding edges. This results in an interesting feedback effect: once an edge has been kept in a cut between two subdomains, the distance between its end vertices will be accounted for in the partial communication cost function to be minimized, and following jobs will be more likely to keep these vertices close to each other, as illustrated in Figure 2.
space that derives from the one which was globally defined at the coarsest level, thus preventing local optimization refinement algorithms to be trapped in local optima of the finer graphs [8]. Diffusion This global optimization method, presented in [42], flows two kinds of antagonistic liquids, scotch and anti-scotch, from two source vertices, and sets the new frontier as the limit between vertices which contain scotch and the ones which contain anti-scotch.
tree rooted at a randomly chosen vertex, and this process is iterated by selecting a new root vertex within the last layer as long as the number of layers increases. Then, starting from the current root vertex, vertices are assigned layer after layer to the first subdomain, until half of the total weight has been processed. Remaining vertices are then allocated to the second subdomain.
for it to account for topological structures of the graph that would else be of a too high level for it to encompass in its local optimization process. 3.1.7 Mapping onto variable-sized architectures Several constrained graph partitioning problems can be modeled as mapping the problem graph onto a target architecture, the number of vertices and topology of which depend dynamically on the structure of the subgraphs to bipartition at each step.
vertex separators are computed by using direct heuristics [28, 37], or from edge separators [48, and included references] by minimum cover techniques [9, 30], but other techniques such as spectral vertex partitioning have also been used [49]. Provided that good vertex separators are found, the nested dissection algorithm produces orderings which, both in terms of fill-in and operation count, compare favorably [19, 31, 46] to the ones obtained with the minimum degree algorithm [39].
measure its quality, several parameters can be defined: hmin , hmax , and havg denote the minimum, maximum, and average heights of the tree1 , respectively, and hdlt is the variance, expressed as a percentage of havg . Since small separators result in small chains in the elimination tree, havg should also indirectly reflect the quality of separators. 3.2.
3.2.6 Graph separation methods The core of our incomplete nested dissection algorithm uses graph separation methods as black boxes. It allows the orderer to run any type of graph separation method compatible with our criteria for quality, that is, reducing the size of the vertex separator while maintaining the loads of the separated parts within some user-specified tolerance.
Scotch can now handle compressed streams on the fly, in several widely used formats such as gzip, bzip2 or lzma. Please refer to Section 6.2 for more information. 4.2 Changes from version 5.0 A new integer index type has been created in the Fortran interface, to address array indices larger than the maximum value which can be stored in a regular integer. Please refer to Section 8.3 for more information.
The numeric flag, similar to the one used by the Chaco graph format [24], is made of three decimal digits. A non-zero value in the units indicates that vertex weights are provided. A non-zero value in the tenths indicates that edge weights are provided. A non-zero value in the hundredths indicates that vertex labels are provided; if it is the case, vertices can be stored in any order in the file; else, natural order is assumed, starting from the graph base index.
The third line holds three figures: the base index of the first element vertex in memory (velmbas), the base index of the first node vertex in memory (vnodbas), and a numeric flag. The Scotch mesh file format requires that all nodes and all elements be assigned to contiguous ranges of indices. Therefore, either all element vertices are defined before all node vertices, or all node vertices are defined before all element vertices.
4 1 3 1 4 4 4 1 2 2 2 1 1 1 2 10 2 5 11 1 6 7 3 8 9 8 4 2 7 5 2 2 1 1 3 3 2 2 (= 5) (= 10) (= 8) 24 000 8 (= 11) 2 (= 5) 6 (= 9) 4 8 3 (= 7) (= 11) (= 6) 3 1 4 (= 6) (= 4) (= 7) 1 3 3 1 Figure 5: Mesh file representing three square elements, with unity vertex and edge weights. Elements are defined before nodes, and numbering of the underlying graph starts from 1. The left part of the figure shows the mesh representation in memory, with consecutive element and node indices.
allowed to do so because, in our approach, the recursive bipartitioning of the target graph is fully independent with respect to that of the source graph (however, the opposite is false). For space and time saving issues, some classical homogeneous architectures (2D and 3D meshes and tori, hypercubes, complete graphs, etc.) have been algorithmically coded within the mapper itself by the means of built-in functions.
5.4.2 Algorithmically-coded architecture files Almost all algorithmically-coded architectures are defined with unity edge and vertex weights. They start with an abbreviation name of the architecture, followed by parameters specific to the architecture. The available built-in architecture definitions are listed below. cmplt size Defines a complete graph with size vertices. The vertex labels are numbers between 0 and size − 1. cmpltw size load0 load1 . . .
mesh2D dimX dimY Defines a bidimensional array of dimX columns by dimY rows. The vertex with coordinates (posX, posY) has label posY × dimX + posX. mesh3D dimX dimY dimZ Defines a tridimensional array of dimX columns by dimY rows by dimZ levels. The vertex with coordinates (posX,posY,posZ ) has label (posZ × dimY + posY) × dimX + posX. torus2D dimX dimY Defines a bidimensional array of dimX columns by dimY rows, with wraparound edges. The vertex with coordinates (posX, posY) has label posY × dimX + posX.
of the target graph vertex onto which it is mapped. Mapping pairs can be stored in any order in the file; however, labels of source graph vertices must be all different. For example, Figure 9 shows the result obtained when mapping the source graph of Figure 4 onto the target architecture of Figure 7. This one-to-one embedding of H(3) into UB(2, 3) has dilation 1, except for one hypercube edge which has dilation 3.
8 0 1 2 3 4 5 6 7 6 3 2 7 1 5 4 0 Figure 10: Ordering file obtained when reordering the hypercube graph of Figure 4. Vertex lists are coded as lists of integer numbers. The first integer is the number of vertices in the list and the other integers are the labels of the selected vertices, given in any order. For example, Figure 11 shows the list made from three vertices of labels 2, 45, and 7. 3 2 45 7 Figure 11: Example of vertex list with three vertices of labels 2, 45, and 7.
External mesh file External graph file mcv gcv mmk_* gmk_* Geometry file amk_* .xyz Source mesh file gmk_msh .msh mtst Source graph file Target file amk_grf .grf mord gord .tgt gtst gmap Ordering file Mapping file .ord .map gotst acpl gout atst gmtst File Graphics file Program Data flow Figure 12: General architecture of the Scotch project. All of the features offered by the stand-alone programs are also available in the libScotch library.
A brief on-line help is provided with all the programs. To get this help, use the “-h” option after the program name. The case of option letters is not significant, except when both the lower and upper cases of a letter have different meanings. When passing parameters to the programs, only the order of file names is significant; options can be put anywhere in the command line, in any order. Examples of use of the different programs of the Scotch project are provided in section 9.
However, compiled architecture files are read much more efficiently, as they are directly loaded into memory without further processing. Since the compilation time of a target architecture graph evolves as the square of its number of vertices, precompiling with acpl can save some time when many mappings are to be performed onto the same large target architecture. Options 6.3.2 -h Display the program synopsis. -V Print the program version and copyright.
Program amk hy outputs the target architecture file of a hypercube graph of dimension dim. Vertices are labeled according to the decimal value of their binary representation. The decomposition-defined target architectures computed by amk hy do not exactly give the same results as the built-in hypercube targets because distances are not computed in the same manner, although the two recursive bipartitionings are identical. To achieve best performance and save space, use the built-in architecture.
6.3.3 amk grf Synopsis amk grf [input graph file [output target file]] options Description The program amk grf turns a source graph file into a decomposition-defined target file. It computes a recursive bipartitioning of the source graph, as well as the array of distances between all pairs of its vertices, both of which are combined to give a decomposition-defined target architecture of same topology as the input source graph.
6.3.4 atst Synopsis atst [input target file [output data file]] options Description The program atst is the architecture tester. It gives some statistics on decomposition-defined target architectures, and in particular the minimum, maximum, and average communication costs (that is, weighted distance) between all pairs of processors. Options 6.3.5 -h Display the program synopsis. -V Print the program version and copyright.
c m s -V Chaco v1.0/MeTiS format. The Matrix Market format. Scotch source graph format. Print the program version and copyright. Default option set is “-Ib0 -Os”. 6.3.6 gmap / gpart Synopsis gmap [input graph file [input target file [output mapping file [output log file]]]] options gpart number of parts [input graph file [output mapping file [output log file]]] options Description The program gmap is the graph mapper.
-brat Set the maximum load imbalance ratio to rat, which should be a value comprised between 0 and 1. This option can be used in conjunction with option -c, but is incompatible with option -m. -cflags Tune the default mapping strategy according to the given preference flags. Some of these flags are antagonistic, while others can be combined. See Section 7.3.1 for more information. The currently available flags are the following. b q s t Enforce load balance as much as possible.
Description The gmk * programs make source graphs. Each of them is devoted to a specific topology, for which it builds target graphs of any dimension. The gmk * programs are mainly used in conjunction with amk grf. Most gmk * programs build source graphs describing parallel machines, which are used by amk grf to generate corresponding target sub-architectures, by means of its -l option.
-h Display the program synopsis. -V Print the program version and copyright. 6.3.9 gmtst Synopsis gmtst [input graph file [input target file [input mapping file [output data file]]]] options Description The program gmtst is the graph mapping tester. It outputs some statistics on the given mapping, regarding load balance and inter-processor communication. The two first statistics lines deal with process mapping statistics, while the following ones deal with communication statistics.
Description The gord program is the block sparse matrix graph orderer. It uses an ordering strategy to compute block orderings of sparse matrices represented as source graphs, whose vertex weights indicate the number of DOFs per node (if this number is non homogeneous) and whose edges are unweighted, in order to minimize fill-in and operation count.
When the geometry of the graph is available, this mapping file may be processed by program gout to display the vertex separators and supervariable amalgamations that have been computed. -ostrat Apply ordering strategy strat. The format of ordering strategies is defined in section 7.3.4. -toutput tree file Write to output tree file the structure of the separator tree.
6.3.12 gout Synopsis gout [input graph file [input geometry file visualization file]]]] options [input mapping file [output Description The gout program is the graph, matrix, and mapping viewer program. It takes on input a source graph, its geometry file, and optionally a mapping result file, and produces a file suitable for display.
a. A subgraph of M2 (4, 4) to be mapped onto K(2). b. Mapping result displayed on the full M2 (4, 4) graph. c. Mapping result displayed on another subgraph of M2 (4, 4). Figure 14: PostScript diplay of a single mapping file with different subgraphs of the same source graph. Vertices covered with disks of the same color are mapped onto the same processor. -oformat[{parameters}] Specify the type of output, with optional parameters within curly braces and separated by commas.
Figure 15: Snapshot of an Open Inventor display of a sphere partitioned into 7 almost equal pieces by mapping onto the complete graph with 7 vertices. Vertices of same color are mapped onto the same processor. Avoid displaying the mapping disks. Opposite of d. Color PostScript output, using 16 different colors. Opposite of g. d Display the mapping disks. Opposite of a. e Encapsulated PostScript output, suitable for LATEX use with epsf. Opposite of f.
-V Print the program version and copyright. Default option set is “-Oi{v}”. 6.3.13 gtst Synopsis gtst [input graph file [output data file]] options Description The program gtst is the source graph tester. It checks the consistency of the input source graph structure (matching of arcs, number of vertices and edges, etc.), and gives some statistics regarding edge weights, vertex weights, and vertex degrees.
s Scotch source mesh format. -oformat Specify the output graph format. The available output formats are listed below. s Scotch source graph format. Print the program version and copyright. -V Default option set is “-Ib0 -Os”. 6.3.15 mmk * Synopsis mmk m2 dimX [dimY [output mesh file]] options mmk m3 dimX [dimY [dimZ [output mesh file]]] options Description The mmk * programs make source meshes.
Since its main purpose is to provide orderings that exhibit high concurrency for parallel block factorization, it comprises a nested dissection method [17], but classical [39] and state-of-the-art [1, 47] minimum degree algorithms are implemented as well. Ordering methods are used to define ordering strategies by means of selection, grouping, and condition operators. The -o option allows the user to define the ordering strategy.
a root of the separator tree (there can be several roots, if the mesh is disconnected). Combined to the column block mapping data produced by option -m, the tree structure allows one to rebuild the separator tree. Print the program version and copyright. -V -vverb Set verbose mode to verb, which may contain several of the following switches. s t 6.3.17 Strategy information. This parameter displays the default ordering strategy used by mord. Timing information.
• error handling routines, which allow the user either to provide his own error servicing routines, or to use the default routines provided in the libScotch distribution. A MeTiS compatibility library, called libscotchmetis.a, is also available. It allows users who were previously using MeTiS in their software to take advantage of the efficieny of Scotch without having to modify their code. The services provided by this library are described in Section 7.14. 7.1 7.1.
exists a SCOTCHFTYPEACTION () Fortran counterpart, in which the separating underscore character is replaced by an “F”. In most cases, the Fortran routines have exactly the same parameters as the C functions, save for an added trailing INTEGER argument to store the return value yielded by the function when the return type of the C function is not void. Since all the data structures used in libScotch are opaque, equivalent declarations for these structures must be provided in Fortran.
“-lscotch -lscotcherr -lm”. If you want to handle errors by yourself, you should not link with library file libscotcherr.a, but rather provide a SCOTCH errorPrint() routine. Please refer to section 7.12 for more information. 7.1.4 Machine word size issues Graph indices are represented in Scotch as integer values of type SCOTCH Num. By default, this type equates to the int C type, that is, an integer type of size equal to the one of the machine word. However, it can represent any other integer type.
and “Data”, allow callers to retrieve information from opaque structures. In all of the following, whenever arrays are defined, passed, and accessed, it is assumed that the first element of these arrays is always labeled as baseval, whether baseval is set to 0 (for C-style arrays) or 1 (for Fortran-style arrays). Scotch internally manages with base values and array pointers so as to process these arrays accordingly. 7.2.1 Architecture format Target architecture structures are completely opaque.
baseval 1 vertnbr 7 4 3 1 4 4 2 1 2 1 4 1 edgenbr 24 1 2 2 3 1 vlbltab velotab 2 3 4 1 4 4 4 4 4 6 3 4 7 4 1 5 4 vendtab verttab 1 4 10 13 16 19 22 25 edgetab 3 2 6 3 4 1 7 6 5 1 2 4 2 7 3 7 2 6 2 1 5 5 2 4 edlotab 1 1 1 2 2 1 2 3 3 1 2 2 2 1 2 1 3 3 3 1 3 1 2 1 Figure 16: Sample graph and its description by libScotch arrays using a compact edge array.
Dynamic graphs can be handled elegantly by using the vendtab array. In order to dynamically manage graphs, one just has to allocate verttab, vendtab and edgetab arrays that are large enough to contain all of the expected new vertex and edge data. Original vertices are labeled starting from baseval, leaving free space at the end of the arrays.
velmbas 1 vnodbas 4 velmnbr 3 vnodnbr 8 4 10 2 5 11 1 edgenbr 24 6 7 vlbltab 3 velotab 8 9 vendtab verttab 1 5 9 13 14 16 18 20 21 22 23 25 edgetab 5 11 7 6 10 5 11 4 8 9 6 7 2 2 1 1 3 1 3 3 3 2 2 1 Figure 18: Sample mesh and its description by libScotch arrays using a compact edge array. Numbers within vertices are vertex indices. Since the edge array is compact, verttab is of size vertnbr + 1 and vendtab points to verttab + 1.
velmbas 11 vnodbas 1 velmnbr 4 vnodnbr 6 1 11 2 3 12 13 edgenbr 24 4 14 5 6 vlbltab velotab verttab 1 2 5 8 9 12 0 0 0 0 13 16 19 22 edgetab 11 11 12 13 11 12 14 13 13 14 12 14 1 2 3 5 2 3 4 5 2 3 6 5 vendtab 2 5 8 9 12 13 0 0 0 0 16 19 22 25 Figure 19: Sample mesh and its description by libScotch arrays, with nodes numbered first and elements numbered last. In order to allow for dynamic remeshing, empty elements (in grey) have been inserted between existing node and element vertices.
velmbas 9 vnodbas 1 velmnbr 6 vnodnbr 7 1 9 11 2 7 3 12 10 13 edgenbr 36 4 14 5 6 vlbltab velotab verttab 25 2 5 8 27 12 31 0 9 35 13 16 19 22 edgetab 11 12 13 9 10 14 13 1 7 3 14 1 2 7 5 2 7 4 5 2 3 6 5 11 9 13 14 12 10 11 9 12 10 7 3 5 vendtab 27 5 8 9 31 13 35 0 12 38 16 19 22 25 Figure 20: Re-meshing of the mesh of Figure 19.
dimnnbr ≤ 2, and its “z” coordinate is located at geomtab[(i - vnodbas) * dimnnbr + baseval + 2] if dimnnbr = 3. 7.2.5 Block ordering format Block orderings associated with graphs and meshes are described by means of block and permutation arrays, made of SCOTCH Nums, as shown in Figure 21.
permtab 2 3 10 6 4 11 8 7 1 12 5 9 peritab 9 1 2 5 11 4 8 7 12 3 6 10 cblknbr 7 1 2 3 4 2 2 5 6 7 8 4 10 3 6 5 7 11 rangtab 1 2 4 5 6 8 10 13 treetab 3 3 7 6 6 7 −1 9 10 11 12 1 1 8 7 5 9 6 3 12 4 Figure 21: Arrays resulting from the ordering by complete nested dissection of a 4 by 3 grid based from 1. Leftmost grid is the original grid, and righmost grid is the reordered grid, with separators shown and column block indices written in bold.
such as floating point exceptions, which could not be properly handled by the calling software. 7.3.2 Mapping strategy strings At the time being, mapping methods only apply to graphs, as there is not yet a mesh mapping tool in the Scotch package. Mapping strategies are made of methods, with optional parameters enclosed between curly braces, and separated by commas, in the form of method [{parameters}] . The currently available mapping methods are the following. m Multi-level method.
u Untie job pools. Subjobs are stored in the next job pool to be processed. map=tie The tie flag defines how results of bipartitioning jobs are propagated to jobs still in pools. t u Tie both mapping tables together. Results are immediately available to jobs in the same job pool. This is the default behavior. Untie mapping tables. Results are only available to jobs of next pool to be processed. poli=policy Select jobs according to policy policy.
cond1 |cond2 Logical or operator. The result of the condition is true if cond1 or cond2 are true, or both. cond1 &cond2 Logical and operator. The result of the condition is true only if both cond1 and cond2 are true. !cond Logical not operator. The result of the condition is true only if cond is false. var relop val Relational operator, where var is a graph variable, val is either a graph variable or a constant of the type of variable var , and relop is one of ’<’, ’=’, and ’>’.
application of the bnd bipartitioning method to the band graph leads to a situation where both anchor vertices are placed in the same part. width=val Set the width of the band graph. All graph vertices that are at a distance less than or equal to val from any frontier vertex are kept in the band graph. d Diffusion method.
pass=nbr Set the maximum number of optimization passes performed by the algorithm. The Fiduccia-Mattheyses algorithm stops as soon as a pass has not yielded any improvement of the cost function, or when the maximum number of passes has been reached. Value −1 stands for an infinite number of passes, that is, as many as needed by the algorithm to converge. g Gibbs-Poole-Stockmeyer method. This method has only one parameter. pass=nbr Set the number of sweeps performed by the algorithm.
7.3.4 Ordering strategy strings Ordering strategies are available both for graphs and for meshes. An ordering strategy is made of one or several ordering methods, which can be combined by means of strategy operators. The strategy operators that can be used in ordering strategies are listed below, by increasing precedence. (strat ) Grouping operator. The strategy enclosed within the parentheses is treated as a single ordering method. /cond ?strat1 [:strat2]; Condition operator.
of large diagonal blocks are likely to be filled at factoring time. However, in the context of incomplete solving methods such as ILU(k) [29], it can lead to a significant reduction of the required memory space and time, because it helps carving large triangular blocks. The parameters of the blocking method are described below. cmin=size Set the minimum size of the resulting subblocks, in number of columns.
frat=rat Fill-in ratio over which some column block will not amalgamate one of its descendents in the elimination tree. Typical values range from 0.05 to 0.10. f Block Halo Approximate Minimum Fill method. The parameters of the Halo Approximate Minimum Fill method are listed below. cmin=size Minimum number of columns per column block. All column blocks of width smaller than size are amalgamated to their parent column block in the elimination tree, provided that it does not violate the cmax constraint.
ordering strategy is then applied to the derived graph, and this ordering is projected back to the nodes of the mesh. This method is here for evaluation purposes only, as mesh ordering methods are generally more efficient than their graph ordering counterpart. strat=strat Graph ordering strategy to apply to the associated graph. 7.3.5 Node separation strategy strings A node separation strategy is made of one or several node separation methods, which can be combined by means of strategy operators.
levl The level of the subgraph in the separators tree, starting from zero at the root of the tree. Integer. proc The number of processors on which the current subgraph is distributed at this level of the separators tree. This variable is available only when calling from routines of the PT-Scotch parallel library. Integer. rank The rank of the current processor among the group of processors on which the current subgraph is distributed at this level of the separators tree.
bal=val Set the load imbalance tolerance to val, which is a floating-point ratio expressed with respect to the ideal load of the partitions. strat=strat Set the graph bipartitioning strategy that is used to compute the edge bipartition. The syntax of bipartitioning strategy strings is defined within section 7.3.3, at page 59. width=type Select the width of the vertex separators built from edge separators.
computed for coarser graphs or meshes. This strategy is not applied to the coarsest graph or mesh, for which only the low strategy is used. low=strat Set the strategy that is used to compute the vertex separator of the coarsest graph or mesh, at the lowest level of the coarsening process. rat=rat Set the threshold maximum coarsening ratio over which graphs or meshes are no longer coarsened. The ratio of any given coarsening cannot be less that 0.5 (case of a perfect matching), and cannot be greater than 1.
7.4 7.4.1 Target architecture handling routines SCOTCH archInit Synopsis int SCOTCH archInit (SCOTCH Arch * archptr) scotchfarchinit (doubleprecision (*) integer archdat, ierr) Description The SCOTCH archInit function initializes a SCOTCH Arch structure so as to make it suitable for future operations. It should be the first function to be called upon a SCOTCH Arch structure. When the target architecture data is no longer of use, call function SCOTCH archExit to free its internal structures.
The SCOTCH archLoad routine fills the SCOTCH Arch structure pointed to by archptr with the source graph description available from stream stream in the Scotch target architecture format (see Section 5.4). Fortran users must use the PXFFILENO or FNUM functions to obtain the number of the Unix file descriptor fildes associated with the logical unit of the architecture file.
string pointers, the scotchfarchname routine takes as second and third parameters a character() array to be filled with the name of the architecture, and the integer size of the array, respectively. If the array is of sufficient size, a trailing nul character is appended to the string to materialize the end of the string (this is the C style of handling character strings).
When listptr is not NULL and listnbr is greater than zero, the decomposition-defined architecture is restricted to the listnbr vertices whose indices are given in the array pointed to by listtab, from listtab[0] to listtab[listnbr - 1]. These indices should have the same base value as the one of the graph pointed to by grafptr, that is, be in the range from 0 to vertnbr − 1 if the graph base is 0, and from 1 to vertnbr if the graph base is 1.
Description The SCOTCH archCmpltw routine fills the SCOTCH Arch structure pointed to by archptr with the description of a weighted complete graph architecture with vertnbr processors. The relative weights of the processors are given in the velotab array. Once the target architecture has been created, it can be used as input to SCOTCH graphMap to perform weighted graph partitioning.
Return values SCOTCH archMesh2D returns 0 if the 2D mesh target architecture has been successfully built, and 1 else. 7.4.
(i + 1) of each node at level i, and linktab[i] is the cost of communication between processors the first common ancestor of which belongs to this level. See Section 5.4.2, page 24, for an example of such an architecture. Return values SCOTCH archTleaf returns 0 if the tree-leaf target architecture has been successfully built, and 1 else. 7.4.
Return values SCOTCH archTorus3D returns 0 if the 3D torus target architecture has been successfully built, and 1 else. 7.5 7.5.1 Graph handling routines SCOTCH graphInit Synopsis int SCOTCH graphInit (SCOTCH Graph * scotchfgraphinit (doubleprecision (*) integer grafptr) grafdat, ierr) Description The SCOTCH graphInit function initializes a SCOTCH Graph structure so as to make it suitable for future operations. It should be the first function to be called upon a SCOTCH Graph structure.
Description The SCOTCH graphFree function frees the graph data of a SCOTCH Graph structure previously initialized by SCOTCH graphInit, but preserves its internal data structures. This call is equivalent to a call to SCOTCH graphExit immediately followed by a call to SCOTCH graphInit. Consequently, the given SCOTCH Graph structure remains ready for subsequent calls to any routine of the libScotch library. 7.5.
7.5.5 SCOTCH graphSave Synopsis int SCOTCH graphSave (const SCOTCH Graph * FILE * scotchfgraphsave (doubleprecision (*) integer integer grafptr, stream) grafdat, fildes, ierr) Description The SCOTCH graphSave routine saves the contents of the SCOTCH Graph structure pointed to by grafptr to stream stream, in the Scotch graph format (see section 5.1).
Description The SCOTCH graphBuild routine fills the source graph structure pointed to by grafptr with all of the data that are passed to it. baseval is the graph base value for index arrays (typically 0 for structures built from C and 1 for structures built from Fortran). vertnbr is the number of vertices. verttab is the adjacency index array, of size (vertnbr + 1) if the edge array is compact (that is, if vendtab equals verttab + 1 or NULL), or of size vertnbr else.
Description The SCOTCH graphBase routine sets the base of all graph indices according to the given base value, and returns the old base value. This routine is a helper for applications that do not handle base values properly. In Fortan, the old base value is returned in the third parameter of the function call. Return values SCOTCH graphBase returns the old base value. 7.5.
Any of these pointers can be set to NULL on input if the corresponding information is not needed. Else, the reference to a dummy area can be provided, where all unwanted data will be written. This routine is useful to get the size of a graph read by means of the SCOTCH graphLoad routine, in order to allocate auxiliary arrays of proper sizes. If the whole structure of the graph is wanted, function SCOTCH graphData should be preferred. 7.5.
hold the number of arcs (that is, twice the number of edges). edgetab is the pointer to a location that will hold the reference to the adjacency array, of size at least *edgeptr. edlotab is the pointer to a location that will hold the reference to the arc load array, of size *edgeptr. Any of these pointers can be set to NULL on input if the corresponding information is not needed. Else, the reference to a dummy area can be provided, where all unwanted data will be written.
scotchfgraphstat (doubleprecision (*) integer*num integer*num integer*num doubleprecision doubleprecision integer*num integer*num doubleprecision doubleprecision integer*num integer*num integer*num doubleprecision doubleprecision grafdat, velomin, velomax, velosum, veloavg, velodlt, degrmin, degrmax, degravg, degrdlt, edlomin, edlomax, edlosum, edloavg, edlodlt) Description The SCOTCH graphStat routine produces some statistics regarding the graph structure pointed to by grafptr.
The SCOTCH graphPart routine computes a partition into partnbr parts of the source graph structure pointed to by grafptr, using the graph partitioning strategy pointed to by stratptr, and returns the partition data in the array pointed to by parttab. The parttab array should have been previously allocated, of a size sufficient to hold as many SCOTCH Num integers as there are vertices in the source graph. On return, every array cell holds the number of the part to which the corresponding vertex is mapped.
7.6.3 SCOTCH graphMapInit Synopsis int SCOTCH graphMapInit (const SCOTCH Graph * SCOTCH Mapping * const SCOTCH Arch * SCOTCH Num * scotchfgraphmapinit (doubleprecision (*) doubleprecision (*) doubleprecision (*) integer*num (*) integer grafptr, mappptr, archptr, parttab) grafdat, mappdat, archdat, parttab, ierr) Description The SCOTCH graphMapInit routine fills the mapping structure pointed to by mappptr with all of the data that is passed to it.
7.6.5 SCOTCH graphMapLoad Synopsis int SCOTCH graphMapLoad (const SCOTCH Graph * SCOTCH Mapping * FILE * scotchfgraphmapload (doubleprecision (*) doubleprecision (*) integer integer grafptr, mappptr, stream) grafdat, mappdat, fildes, ierr) Description The SCOTCH graphMapLoad routine fills the SCOTCH Mapping structure pointed to by mappptr with the mapping data available in the Scotch mapping format (see section 5.5) from stream stream.
7.6.7 SCOTCH graphMapCompute Synopsis int SCOTCH graphMapCompute (const SCOTCH Graph * SCOTCH Mapping * const SCOTCH Strat * scotchfgraphmapcompute (doubleprecision (*) doubleprecision (*) doubleprecision (*) integer grafptr, mappptr, straptr) grafdat, mappdat, stradat, ierr) Description The SCOTCH graphMapCompute routine computes a mapping on the given SCOTCH Mapping structure pointed to by mappptr using the mapping strategy pointed to by stratptr.
Return values SCOTCH mapView returns 0 if the data has been successfully written to stream, and 1 else. 7.7 Graph ordering routines The first routine provides high-level functionality and frees the user from the burden of calling in sequence several of the low-level routines described afterward. 7.7.
vertnbr − 1 if the graph base is 0, and from 1 to vertnbr if the graph base is 1. The three other result fields, *cblkptr, rangtab and treetab, contain data related to the block structure. *cblkptr holds the number of column blocks of the produced ordering, and rangtab holds the starting indices of each of the permuted column blocks, in increasing order, so that column block i starts at index rangtab[i] and ends at index (rangtab[i+1]−1), inclusive, in the new ordering.
vertnbr + 1, and treetab is the array holding the structure of the separators tree, of size vertnbr. See the above manual page of SCOTCH graphOrder, as well as section 7.2.5, for an explanation of the semantics of all of these fields. The SCOTCH graphOrderInit routine should be the first function to be called upon a SCOTCH Ordering structure for ordering graphs. When the ordering structure is no longer of use, the SCOTCH graphOrderExit function must be called, in order to to free its internal structures.
Return values SCOTCH graphOrderLoad returns 0 if the ordering structure has been successfully loaded from stream, and 1 else. 7.7.
resulting mapping file can be used by the gout program (see Section 6.3.12) to produce pictures showing the different separators and blocks. Fortran users must use the PXFFILENO or FNUM functions to obtain the number of the Unix file descriptor fildes associated with the logical unit of the mapping file. Return values SCOTCH graphOrderSaveMap returns 0 if the ordering structure has been successfully written to stream, and 1 else. 7.7.
Description The SCOTCH graphOrderCheck routine checks the consistency of the given SCOTCH Ordering structure pointed to by ordeptr. Return values SCOTCH graphOrderCheck returns 0 if ordering data are consistent, and 1 else. 7.7.
Description The SCOTCH graphOrderComputeList routine computes a block ordering of a subgraph of the graph structure pointed to by grafptr, using the ordering strategy pointed to by stratptr, and stores its result in the ordering structure pointed to by ordeptr. The induced subgraph is described by means of a vertex list: listnbr holds the number of vertices to keep in the induced subgraph, the indices of which are given, in any order, in the listtab array.
Return values SCOTCH meshInit returns 0 if the mesh structure has been successfully initialized, and 1 else. 7.8.2 SCOTCH meshExit Synopsis void SCOTCH meshExit (SCOTCH Mesh * meshptr) scotchfmeshexit (doubleprecision (*) meshdat) Description The SCOTCH meshExit function frees the contents of a SCOTCH Mesh structure previously initialized by SCOTCH meshInit. All subsequent calls to SCOTCH mesh* routines other than SCOTCH meshInit, using this structure as parameter, may yield unpredictable results.
7.8.4 SCOTCH meshSave Synopsis int SCOTCH meshSave (const SCOTCH Mesh * FILE * scotchfmeshsave (doubleprecision (*) integer integer meshptr, stream) meshdat, fildes, ierr) Description The SCOTCH meshSave routine saves the contents of the SCOTCH Mesh structure pointed to by meshptr to stream stream, in the Scotch mesh format (see section 5.2).
scotchfmeshbuild (doubleprecision (*) integer*num integer*num integer*num integer*num integer*num (*) integer*num (*) integer*num (*) integer*num (*) integer*num (*) integer*num integer*num (*) integer*num meshdat, velmbas, vnodbas, velmnbr, vnodnbr, verttab, vendtab, velotab, vnlotab, vlbltab, edgenbr, edgetab, ierr) Description The SCOTCH meshBuild routine fills the source mesh structure pointed to by meshptr with all of the data that is passed to it.
stage, to call the SCOTCH meshCheck routine on the newly created SCOTCH Mesh structure, prior to any other calls to libScotch routines. Return values SCOTCH meshBuild returns 0 if the mesh structure has been successfully set with all of the input data, and 1 else. 7.8.
This routine is useful to get the size of a mesh read by means of the SCOTCH meshLoad routine, in order to allocate auxiliary arrays of proper sizes. If the whole structure of the mesh is wanted, function SCOTCH meshData should be preferred. 7.8.
the reference to the adjacency end index array, and is equal to verttab + 1 if the adjacency array is compact. velotab and vnlotab are pointers to locations that will hold the reference to the element and node vertex load arrays, of sizes *velmptr and *vnodptr, respectively. vlbltab is the pointer to a location that will hold the reference to the vertex label array, of size (*velmptr + *vnodptr). edgeptr is the pointer to a location that will hold the number of arcs (that is, twice the number of edges).
scotchfmeshstat (doubleprecision (*) integer*num integer*num integer*num doubleprecision doubleprecision integer*num integer*num doubleprecision doubleprecision integer*num integer*num doubleprecision doubleprecision meshdat, vnlomin, vnlomax, vnlosum, vnloavg, vnlodlt, edegmin, edegmax, edegavg, edegdlt, ndegmin, ndegmax, ndegavg, ndegdlt) Description The SCOTCH meshStat routine produces some statistics regarding the mesh structure pointed to by meshptr.
In order to save memory space as well as computation time, in the current implementation of SCOTCH meshGraph, some mesh arrays are shared with the graph structure. Therefore, one should make sure that the graph must no longer be used after the mesh structure is freed. The graph structure can be freed before or after the mesh structure, but must not be used after the mesh structure is freed.
On return, permtab holds the direct permutation of the unknowns, that is, node vertex i of the original mesh has index permtab[i] in the reordered mesh, while peritab holds the inverse permutation, that is, node vertex i in the reordered mesh had index peritab[i] in the original mesh.
The SCOTCH meshOrderInit routine fills the ordering structure pointed to by ordeptr with all of the data that are passed to it. Thus, all subsequent calls to ordering routines such as SCOTCH meshOrderCompute, using this ordering structure as parameter, will place ordering results in fields permtab, peritab, *cblkptr, rangtab or treetab, if they are not set to NULL.
Description The SCOTCH meshOrderSave routine saves the contents of the SCOTCH Ordering structure pointed to by ordeptr to stream stream, in the Scotch ordering format (see section 5.6). Return values SCOTCH meshOrderSave returns 0 if the ordering structure has been successfully written to stream, and 1 else. 7.9.
Description The SCOTCH meshOrderSaveTree routine saves the tree hierarchy information associated with the SCOTCH Ordering structure pointed to by ordeptr to stream stream. The format of the tree output file resembles the one of a mapping or ordering file: it is made up of as many lines as there are node vertices in the ordering. Each of these lines holds two integer numbers.
Description The SCOTCH meshOrderCompute routine computes a block ordering of the mesh structure pointed to by grafptr, using the mapping strategy pointed to by stratptr, and stores its result in the ordering structure pointed to by ordeptr. On return, the ordering structure holds a block ordering of the given mesh (see section 7.9.2 for a description of the ordering fields). Return values SCOTCH meshOrderCompute returns 0 if the ordering has been successfully computed, and 1 else.
7.10.3 SCOTCH stratSave Synopsis int SCOTCH stratSave (const SCOTCH Strat * FILE * scotchfstratsave (doubleprecision (*) integer integer straptr, stream) stradat, fildes, ierr) Description The SCOTCH stratSave routine saves the contents of the SCOTCH Strat structure pointed to by straptr to stream stream, in the form of a text string.
7.10.5 SCOTCH stratGraphMap Synopsis int SCOTCH stratGraphMap (SCOTCH Strat * const char * scotchfstratgraphmap (doubleprecision (*) character (*) integer straptr, string) stradat, string, ierr) Description The SCOTCH stratGraphMap routine fills the strategy structure pointed to by straptr with the graph mapping strategy string pointed to by string. From this point, the strategy structure can only be used as a mapping strategy, to be used by function SCOTCH graphMap, for instance.
7.10.7 SCOTCH stratGraphOrder Synopsis int SCOTCH stratGraphOrder (SCOTCH Strat * const char * scotchfstratgraphorder (doubleprecision (*) character (*) integer straptr, string) stradat, string, ierr) Description The SCOTCH stratGraphOrder routine fills the strategy structure pointed to by straptr with the graph ordering strategy string pointed to by string. From this point, the strategy structure can only be used as a graph ordering strategy, to be used by function SCOTCH graphOrder, for instance.
7.10.9 SCOTCH stratMeshOrder Synopsis int SCOTCH stratMeshOrder (SCOTCH Strat * const char * scotchfstratmeshorder (doubleprecision (*) character (*) integer straptr, string) stradat, string, ierr) Description The SCOTCH stratMeshOrder routine fills the strategy structure pointed to by straptr with the mesh ordering strategy string pointed to by string. From this point, strategy strat can only be used as a mesh ordering strategy, to be used by function SCOTCH meshOrder, for instance.
7.11 Geometry handling routines Since the Scotch project is based on algorithms that rely on topology data only, geometry data do not play an important role in the libScotch library. They are only relevant to programs that display graphs, such as the gout program. However, since all routines that are used by the programs of the Scotch distributions have an interface in the libScotch library, there exist geometry handling routines in it, which manipulate SCOTCH Geom structures.
The SCOTCH geomExit function frees the contents of a SCOTCH Geom structure previously initialized by SCOTCH geomInit. All subsequent calls to SCOTCH *Geom* routines other than SCOTCH geomInit, using this structure as parameter, may yield unpredictable results. 7.11.
7.11.
The SCOTCH graphGeomSaveChac routine saves the contents of the SCOTCH Graph structure pointed to by grafptr to stream grafstream, in the Chaco graph format [24]. Since this graph format does not handle geometry data, the geomptr and geomstream fields are not used, as well as the string field. Fortran users must use the PXFFILENO or FNUM functions to obtain the number of the Unix file descriptor graffildes associated with the logical unit of the graph file.
int SCOTCH graphGeomLoadScot (SCOTCH Graph * SCOTCH Geom * FILE * FILE * const char * scotchfgraphgeomloadscot (doubleprecision (*) doubleprecision (*) integer integer character (*) grafptr, geomptr, grafstream, geomstream, string) grafdat, geomdat, graffildes, geomfildes, string) Description The SCOTCH graphGeomLoadScot routine fills the SCOTCH Graph and SCOTCH Geom structures pointed to by grafptr and geomptr with the source graph description and geometry data available from streams grafstream and geom
Fortran users must use the PXFFILENO or FNUM functions to obtain the numbers of the Unix file descriptors graffildes and geomfildes associated with the logical units of the graph and geometry files. Return values SCOTCH graphGeomSaveScot returns 0 if the graph topology and geometry have been successfully written to grafstream and geomstream, and 1 else. 7.11.
scotchfmeshgeomloadscot (doubleprecision (*) doubleprecision (*) integer integer character (*) meshdat, geomdat, meshfildes, geomfildes, string) Description The SCOTCH meshGeomLoadScot routine fills the SCOTCH Mesh and SCOTCH Geom structures pointed to by meshptr and geomptr with the source mesh description and node geometry data available from streams meshstream and geomstream in the Scotch mesh and geometry formats (see sections 5.2 and 5.3, respectively). The string field is not used.
7.12 Error handling routines The handling of errors that occur within library routines is often difficult, because library routines should be able to issue error messages that help the application programmer to find the error, while being compatible with the way the application handles its own errors.
The SCOTCH errorProg function is designed to be called at the beginning of a program or of a portion of code to identify the place where subsequent errors take place. This routine is not reentrant, as it is only a minor help function. It is defined in libscotcherr.a and is used by the standalone programs of the Scotch distribution. 7.13 Miscellaneous routines 7.13.
metis edgend (integer integer integer integer integer integer integer (*) (*) (*) (*) (*) n, xadj, adjncy, numflag, options, perm, iperm) Description The METIS EdgeND function performs a nested dissection ordering of the graph passed as arrays xadj and adjncy, using the default Scotch ordering strategy. The options array is not used.
While Scotch has also both node and edge separation capabilities, all of the three MeTiS stubs METIS EdgeND, METIS NodeND and METIS NodeWND call the same Scotch routine, which uses the Scotch default ordering strategy proved to be efficient in most cases. 7.14.
void METIS PartGraphKway (const const const const const const const const const int * int * metis partgraphkway (integer integer integer integer integer integer integer integer integer integer integer int * int * int * int * int * int * int * int * int * const const (*) (*) (*) (*) (*) (*) const const const const const const const const const n, xadj, adjncy, vwgt, adjwgt, wgtflag, numflag, nparts, options, edgecut, part) n, xadj, adjncy, vwgt, adjwgt, wgtflag, numflag, nparts, options, edgecut, part)
metis partgraphrecursive (integer integer integer integer integer integer integer integer integer integer integer (*) (*) (*) (*) (*) (*) n, xadj, adjncy, vwgt, adjwgt, wgtflag, numflag, nparts, options, edgecut, part) Description The METIS PartGraphRecursive function performs a mapping onto the complete graph of the graph represented by arrays xadj, adjncy, vwgt and adjwgt, using the default Scotch mapping strategy. The options array is not used.
Description The METIS PartGraphVKway function performs a mapping onto the complete graph of the graph represented by arrays xadj, adjncy, vwgt and vsize, using the default Scotch mapping strategy. The options array is not used. The part array has the same meaning as the parttab array of Scotch.
COMMON FILE COMPRESS LZMA for lzma decompression. Note that the corresponding development libraries must be installed on your system before compile time, and that compressed file handling can take place only on systems which support multi-threading or multi-processing. In the first case, you must set the SCOTCH PTHREAD flag in order to take advantage of these features.
This can also be done in a single piped command: % echo cmplt 7 | gmap brol.grf - /tmp/brol.map If compressed data handling is enabled, read the graph as a gzip compressed file, and output the mapping as a bzip2 file, on the fly: % echo cmplt 7 | gmap brol.grf.gz - /tmp/brol.map.bz2 • Partition source graph brol.grf into two uneven parts of respective weights 4 7 11 and 11 , and save the result to file /tmp/brol.map. % echo cmpltw 2 4 7 > /tmp/k2w.tgt % gmap brol.grf /tmp/k2w.tgt /tmp/brol.
To speed up target architecture loading in the future, the decompositiondefined target architecture is compiled by means of acpl. • Build an architecture graph which is the subgraph of the 8-node de Bruijn graph restricted to vertices labeled 1, 2, 4, 5, 6, map graph graph.grf onto it, and save the result to file /tmp/brol.map. % (gmk ub2 3; echo 5 1 2 4 5 6) | amk grf -L | gmap graph.grf /tmp/brol.
simple mesh, in the form of a bipartite graph, the definition of which is given in file mesh.h, respectively. From this structure are derived enriched graph and mesh structures: • Bgraph, in file bgraph.h: graph with bipartition, that is, edge separation, information attached to it; • Kgraph, in file kgraph.h: graph with mapping information attached to it; • Hgraph, in file hgraph.h: graph with halo information attached to it, for computing graph orderings; • Vgraph, in file vgraph.
1. Write the code of the method itself. First, choose a free two-letter code to describe your method, say “xy”. In the libscotch source directory, create files vgraph separate xy.c and vgraph separate xy.h, basing on existing files such as vgraph separate gg.c and vgraph separate gg.h, for instance. If the method is complex, it can be split across several other files, which will be named vgraph separate xy firstmodulename.c, vgraph separate xy secondmodulename.c, eventually with matching header files.
• STRATPARAMSTRAT: strategy. For instance, the graph ordering method by nested dissection takes a vertex partitioning strategy as one of its parameters, to compute the vertex separators. The fourth and fifth fields are the address of the location of the default structure and the address of the parameter within this default structure, respectively.
• Alex Pothen kindly gave me a version of his Multiple Minimum Degree algorithm, which was embedded into Scotch from version 3.2 to version 3.4; • Luca Scarano, visiting Erasmus student from the Universitá degli Studi di Bologna, coded the multi-level graph algorithm in Scotch 3.1; • Yves Secretan contributed to the MinGW32 port; • David Sherman proofread version 3.2 of this manual. References [1] P. Amestoy, T. Davis, and I. Duff. An approximate minimum degree ordering algorithm. SIAM J. Matrix Anal.
[13] M. R. Garey and D. S. Johnson. Computers and Intractablility: A Guide to the Theory of NP-completeness. W. H. Freeman, San Francisco, 1979. [14] G. A. Geist and E. G.-Y. Ng. Task scheduling for parallel sparse Cholesky factorization. International Journal of Parallel Programming, 18(4):291–314, 1989. [15] A. George, M. T. Heath, J. W.-H. Liu, and E. G.-Y. Ng. Sparse Cholesky factorization on a local memory multiprocessor. SIAM Journal on Scientific and Statistical Computing, 9:327–340, 1988. [16] A.
[29] P. Hénon, F. Pellegrini, P. Ramet, J. Roman, and Y. Saad. High performance complete and incomplete factorizations for very large sparse systems by using scotch and pastix softwares. In Proc. 11th SIAM Conference on Parallel Processing for Scientific Computing, San Francisco, USA, February 2004. [30] J. Hopcroft and R. Karp. An n5/2 algorithm for maximum matchings in bipartite graphs. SIAM Journal of Computing, 2(4):225–231, December 1973. [31] G. Karypis and V. Kumar.
[44] F. Pellegrini and J. Roman. Experimental analysis of the dual recursive bipartitioning algorithm for static mapping. Research Report, LaBRI, Université Bordeaux I, August 1996. Available from http://www.labri.fr/~pelegrin/ papers/scotch_expanalysis.ps.gz. [45] F. Pellegrini and J. Roman. Scotch: A software package for static mapping by dual recursive bipartitioning of process and architecture graphs. In Proc. HPCN’96, Brussels, LNCS 1067, pages 493–498, April 1996. [46] F. Pellegrini and J. Roman.