User's guide Mondriaan version 3 (July 27, 2010)

Installing
Compiling
As a library
Running
Output
MATLAB
Options
More...

This page is continuously being improved and updated; therefore, a more recent version may be obtained online. This offline version is bundled with the software for your convenience.


How to download and install Mondriaan

Download the latest version from the Mondriaan software homepage. Uncompress with, e.g.,

This will create a directory Mondriaan3 which contains all the files of the Mondriaan package.

Important files are:

How to compile and test Mondriaan

Go inside the directory Mondriaan3 and type

Here '%' is the prompt of your operating system. This will compile the Mondriaan library, tools included with the Mondriaan library, and all 96 unit tests of the library with the default options as given in mondriaan.mk. After compilation the include files are located in Mondriaan3/src/include, the compiled library in Mondriaan3/src/lib, and the stand-alone Mondriaan tools in Mondriaan3/tools. An example of how to use the Mondriaan library in your own program is provided by Mondriaan3/docs/example.c and Mondriaan3/tools/Mondriaan.c.

For more information about MATLAB usage, please see the Mondriaan MATLAB guide. To enable MATLAB support for Mondriaan, edit the file mondriaan.mk and change the line containing the variable MATLABHOMEDIR to your current MATLAB install directory and remove the '%' in front of the line to uncomment it. The variable MEXSUFFIX should be set to the appropriate extension for your platform, as specified on the Mathworks site. This should be done before compiling Mondriaan.

Similarly, to enable PaToH support (the PaToH library can be found here), edit mondriaan.mk to change the line containing the PATOHHOMEDIR variable, as detailed above.

To compile and run all 96 unit tests, type

For each function of Mondriaan, we have written a small (undocumented) test, which will check the function for proper behaviour. The unit tests have been tried successfully on three architectures: Linux, Mac Os X, Solaris. If one or more of the tests fails, please try to identify the relevant error message, and inform me (R . H . Bisseling @ NOSPAM uu . nl ) about the problem. I'll try to help you solve it. The tests are in a script runtest in the tests/ subdirectory.

Timers

The compile option -DTIME causes the CPU time used by Mondriaan for the matrix distribution to be printed, and also the time for the vector distribution. This timer has an accuracy of at least 0.01 seconds; it is in danger of clock wraparound. The compile option -DUNIX tells Mondriaan you are using a UNIX system. Together with -DTIME this also causes the elapsed (wall clock) time to be printed, which is an upper bound on the CPU time. Usually the resolution is higher, and there is no danger of clock wraparound. This is the best timer if you are the single user of the system.

Levels of verbosity

Mondriaan writes the output distributions to file. It can also generate useful statistics about the partitioning to the standard output stream stdout (or the screen). There are three levels of verbosity: silent, standard, and verbose. The silent mode is useful when running the unit tests by make test. (If these are not done silently, the OKs are obscured by clutter.) The silent mode may also be useful if (parts of) Mondriaan are used as library functions. The verbose mode is useful when debugging, or trying to understand a particular run in detail. The standard mode generates 1-2 pages of output, and is aimed at easy digestion. You can change the verbosity level by commenting and uncommenting the appropriate CFLAGS lines in mondriaan.mk. Using the flag -DINFO generates standard output, and using the flags -DINFO -DINFO2 generates verbose output.

How to use Mondriaan as a library

After successful compilation (using make), all relevant header files are exported to the src/include directory, and a static library is available in the src/lib directory.

To illustrate the use of Mondriaan we provide a small example program together with this guide. This example can be compiled (provided the Mondriaan library was succesfully built) by executing

in the docs/ directory which will generate the executable a.out. More advanced use of the library is illustrated in tools/Mondriaan.c.

Mondriaan uses a triplet-based datastructure (the Matrix Market format) to store sparse matrices; that is, for each nonzero, its row- and column-index are stored, as well as its numerical value (in general). The exact representation is detailed in src/SparseMatrix.c/.h; and to successfully interface, your sparse matrix scheme has to be translated into this format.

Since the Compressed Row Storage (CRS), or alternatively known as Compressed Sparse Row (CSR), is a more prevailing standard, SparseMatrix comes with a translation function specifically for that datastructure: CRSSparseMatrixInit. It takes as input an uninitialised Mondriaan SparseMatrix struct, the matrix dimensions and number of nonzeroes, and the CRS datastructure arrays. The base parameter automatically translates from, e.g., 1-based arrays (base=1) as they may appear in for example Fortran, to 0-based arrays as used in Mondriaan.

Next is setting any options for Mondriaan. Recommended is to use an options file, storing all the defaults tuned to your application (e.g. as provided by tools/Mondriaan.defaults). These defaults can be read from file and stored in the Mondriaan Options struct by using the function SetOptionsFromFile; see src/Options.c/.h.

Once both the sparse matrix and the options are ready the main Mondriaan distribution function, DistributeMatrixMondriaan, can be used. This function takes as parameters the sparse matrix struct, the number of processors, the load imbalance parameter, and the options struct. The callback function is for advanced use, and is usually kept a NULL pointer (an example of using the callback is given in tools/MondriaanPlot.c). This function is a blocking call. After partitioning, all relevant information can be extracted from the SparseMatrix struct (see the source code for details). The struct can be freed by calling MMDeleteSparseMatrix.

Interfacing with non-C code is best achieved by writing a custom interface between your code and Mondriaan, using extern "C"-style additions when, e.g., interfacing between C++ and C; or using any other specific translation required (e.g., writing all parameters as pointers (Fortran), using jni (Java), etc.). As an example, the MATLAB interface is given in tools/MatlabMondriaan.c.

How to run Mondriaan

Go inside the directory Mondriaan3 and type

if you want to partition the arc130.mtx matrix (Matrix Market file format) for 8 processors with at most 3% load imbalance. The matrix should be the full relative path; in the above example output is saved in the Mondriaan tests folder.
Mondriaan may also be used to partition matrices provided via the standard input, by using

to obtain the same result as with the previous method, with one difference: results are written in the current (tools) directory. For integration with already existing software you may have, without resorting to writing and reading files (or pipes), look at the how to use Mondriaan as a library section of this file.

Output

The main Mondriaan tool yields, after a succesful run on an input matrix, various output files. All possible output files are described below. Typically, the output filenames are that of the input matrix filename, appended with a small descriptor and usually the number of parts x Mondriaan was requested to construct.

Distributed matrix (-Px)

The Mondriaan program writes the distributed matrix to a file called input-Px, where input is the name of the input matrix, or stdin if the matrix was read from the standard input, and x is the number of processors used in the distribution. We use an adapted Matrix Market format, with this structure:
%%MatrixMarket distributed-matrix coordinate real general
m n nnz P
Pstart[0]
( this should be 0 )
...
...
...
Pstart[P]( this should be nnz )
A.i[0] A.j[0] A.value[0] ...
...
...
A.i[nnz-1] A.j[nnz-1] A.value[nnz-1]

Here, Pstart[k] points to the start of the nonzeroes of processor k.

Processor indices (-Ix)

The Mondriaan program also writes the processor indices of each nonzero to the Matrix Market file input-Ix where the value of each nonzero is replaced by the processor index to which the nonzero has been assigned. The order of the nonzeroes is exactly that of the distributed matrix (-Px).

Row and column permutations (-rowx, -colx)

Mondriaan writes the row and column permutations determined by the Mondriaan algorithm (set by the Permute option) to input-rowx and input-colx. The goal of these permutations is to bring the input matrix A into (doubly Separated Block Diagonal (SBD) or Bordered Block Diagonal (BBD) form, after applying the found permutations. This can have many possible applications, including, but not limited to, cache-oblivious sparse matrix vector multiplication (SBD form) or minimising fill-in in sparse LU decomposition (BBD form). These files are not written if the Permute option is set to none.

Reordered matrix (-reor-Px)

Mondriaan also directly writes the permuted matrix PAQ to file, where the permutation matrix P corresponds to the row permutation determined by the Mondriaan algorithm, as described in the previous paragraph. The permutation matrix Q is similarly inferred from the column permutation. This file is not written if the Permute option is set to none.

Separator boundary indices (-rowblocksx, -colblocksx)

Figure 1
Figure 2

The (doubly) separated block diagonal and bordered block diagonal forms each identify groups of nonzeroes, the separators, which serve as connectors between two parts resulting from a single bipartition. In the reordered matrix described above, these groups appear as relatively dense and usually small strips of consecutive rows and columns; the separator blocks. The indices where these blocks start and end in the reordered matrix are given in these two files. These files are not written if the Permute option is set to none.

A more detailed explanation follows from the (idealised) Figures 1 and 2. The first shows a reordering corresponding to a single bipartition, clearly yielding four row and column indices indicating the start and end of the differently coloured partitions. This can occur recursively as shown in the second figure; there, each non-red partition is an instance of Figure 1, resulting in sixteen row and column indices indicating the separators for the eight partitions.

Input/output vector distributions (-ux, -vx)

The program writes the processor numbers of the vector components to the files called input-ux and input-vx, where input is the name of the input matrix and x is the number of processors used in the distribution. The vectors u and v are the output and input vectors of the sparse matrix-vector multiplication u=A*v.

In I/O files, all indices (i,j) for matrix entries a(i,j) and vector components u(i) and v(j) start numbering from 1, following the Matrix Market conventions. In I/O, the processors are numbered 1 to P. Internally, the indices are converted to the standard C-numbering starting from 0.

Cartesian submatrices (-Cx)

The program writes the row index sets I(q) and column index sets J(q) of the Cartesian submatrix I(q) x J(q) for the processors q=1,...,P to the file called input-Cx, where input is the name of the input matrix and x is the number of processors used in the distribution. This file is additional information, useful e.g. for visualisation, and you may not need it.

Statistics on standard output

Provided the library is compiled with the -DINFO or -DINFO2 option, the program prints plenty of useful statistics to standard output. The communication volume is given for the two phases of the matrix-vector multiplication separately: the volume for v (first communication phase) and the volume for u (second communication phase). The bottom line is the communication cost which is defined as the sum of the costs of the two phases. The cost of a phase is the maximum of the number of data words sent and the number of data words received, over all processors. This metric is also called the BSP cost; it is the cost metric of the Bulk Synchronous Parallel model.

Graphical output

In the tools/ subdirectory the program MondriaanPlot can be used to get insight in the Mondriaan partitioning algorithm. This program creates a series of Truevision TGA image files named img0000.tga, img0001.tga, ..., up to the number of processors over which the matrix is divided. To use this program, issue

If you have mencoder installed, you can use the MondriaanMovie script to generate a small movie of the partitioning process:

This will create ../tests/arc130.mtx.avi. If you have imagemagick installed (or if the convert command is otherwise available), you can use MondriaanGIF to create an animated GIF of the partitioning process:

This will create ../tests/arc130.mtx.gif.

Using Mondriaan in MATLAB

For more information about MATLAB usage, please see the Mondriaan MATLAB guide.

Program options

The Mondriaan options can be set in the Mondriaan.defaults file. If no such file exists, it is created at run time. Afterward, this file can be edited for further runs. The default values of the options are given below in boldface. It is possible to overrule the defaults from the command line, e.g. by typing

you can force Mondriaan to split the matrix in one dimension only, namely by rows.

Nonnumerical options

The nonnumerical options are used to choose partitioning methods. You may need to change them from the defaults to explore different partitioning methods.

Numerical options

The numerical options are often used to optimise given partitioning methods. Setting values such as NrRestarts, MaxNrLoops, or MaxNrNoGainMoves to a higher number, often results in better quality of the partitioning solution, at the expense of increased run-time.

Profiling Mondriaan

To profile the Mondriaan library for various configurations, please use the Profile program and RunProfiler script both included in the tools/ directory. Entering the tools/ directory and executing

partitions the matrices a.mtx, b.mtx, and c.mtx among 8 processors with at most 3% imbalance, taking the average over 10 runs with a different random seed. The results are written, via stdout, to the LaTeX file out.tex which then contains a table with the averaged results of the partitioning process.

For code developers

If you develop new code to add to Mondriaan, you probably would like to add possibilities to use your code through a new option. In that case you need to adjust a few files:

More information

All functions of Mondriaan have extensive documentation in the source code (.c files). Please have a look there for more details.


Last updated July 27, 2010 by Bas Fagginger Auer,
June 30, 2010 by Albert-Jan Yzelman,
May 27, 2010 by Bas Fagginger Auer,
June 9, 2009 by Rob Bisseling.
To the Mondriaan package home page.

Valid HTML 4.01 Strict Valid CSS!