Mondriaan and MATLAB


This guide is a step-by-step introduction to using Mondriaan together with MATLAB. For more extensive information about Mondriaan, please take a look at the user's guide.


Known issues

Unfortunately, the Matlab interface for Mondriaan does not work with the most recent Matlab versions any more. We do not envision to repair this in the near future. However, volunteers for this task are always welcome!

How to download and install Mondriaan

Download the latest version from the Mondriaan software homepage. Uncompress with

This will create a directory Mondriaan4 which contains all the files of the Mondriaan package. To enable MATLAB support, open the file Mondriaan4/mondriaan.mk with a text-editor and look for a line which looks similar to

Change the directory on the right-hand side to your installation directory of MATLAB and remove the # in front of the line, such that it looks similar to

Furthermore make sure that the variable MEXSUFFIX is set to the proper extension for MATLAB binary files for your system (from the Mathworks site):

PlatformMEXSUFFIX
Linux (32-bit)mexglx
Linux (64-bit)mexa64
Apple Macintosh (32-bit)mexmaci
Apple Macintosh (64-bit)mexmaci64
Microsoft Windows (32-bit)mexw32
Microsoft Windows (64-bit)mexw64

For example: on a 32-bit Macintosh system we would have MEXSUFFIX := mexmaci.

Now we are ready to compile Mondriaan, run

which will build Mondriaan and the associated tools.

A small example

In this example we will partition a small test matrix using the MATLAB interface of Mondriaan.

As test matrix we can use tbdmatlab.mtx.gz from the Mondriaan website. The archive should be extracted to the Mondriaan4/tools directory.

Start MATLAB and navigate to the Mondriaan4/tools directory in the Current Directory subwindow. To read and view tbdmatlab.mtx, issue

We can partition the matrix A among 30 processors with a maximum imbalance of 3% by using the mondriaan function in MATLAB

where I is the same matrix as A, only with the real values of all the matrix nonzeroes set to the index of the processor to which the nonzero was assigned, and s contains partitioning information. Full output can be generated with

where the last parameter (2) is the desired permutation method (see below). Here, p and q are permutation vectors, r and c are row-boundaries and column-boundaries corresponding to the ordering's block structure, rh and ch store the separator hierarchy information, the matrix B stores the reordered matrix PAQ (in MATLAB terminology: B=A(p,q)) and finally u and v contain the indices of the processors to which the vector components are assigned (for parallel multiplication of u = A*v). See the User's Guide for full details on these output vectors and matrices. For particulars on the boundary and hierarchy functions, jump to the appropriate section here.

ValueOrdering
0None (default value)
1reverse BBD (reverse Bordered Block Diagonal)
2SBD (Separated Block Diagonal)
3BBD (Bordered Block Diagonal)

Exploiting symmetry

The MATLAB interface additionally has an option to make use of any symmetry properties of the input matrix A. This is done by setting a fifth parameter, such that the full call becomes:

where symm is 0 by default (if the parameter is not given), and indicates A is not symmetric. If symm takes a value 1 or 2, A is assumed symmetric and only the lower triangular part of A is passed through to Mondriaan. This is exactly the same as using the regular (terminal-based) Mondriaan application on a symmetric matrix with the options SymmetricMatrix_UseSingleEntry set to yes and SymmetricMatrix_SingleEntryType set to lower. If these options are not set in the Mondriaan.defaults file, they will be forced. The matrices I and B will still correspond to the full matrix A. Any SplitStrategy can still be used, and is taken as usual from the Mondriaan.defaults file. Recommended is to use the finegrain or symmetric finegrain strategies. Others will work, but may not minimise the communication volume during parallel sparse matrix-vector multiplication when considering the full matrix A.

Setting symm to 2 will indicate the matrix is structurally symmetric, but as said before, still only the lower triangular part of A is passed through to Mondriaan. This makes no difference for any of the output parameters, except for B, which would, for symm=1, return an incorrect full matrix PAQ as the full reordered matrix is inferred only from the lower triangular part. Setting symm to 2 prevents this by automatically postprocessing B by rebuilding PAQ using the output parameters p and q.

Note that setting symm equal to 1 or 2 yields symmetric permutations (B=PAPT). Also note that it is not checked whether the input matrix really is symmetric, and as such unsymmetric matrices can also be passed through this method. This probably does not yield any meaningful results.

Example uses

We present two small examples of using Matlab in conjunction with Mondriaan; the first will be on speeding up the sequential sparse matrix-vector multiply, the second will illustrate speeding up the sequential sparse LU decomposition. Assumed is that the working directory is Mondriaan4/tools. Also available should be:

In both examples, the experiment is executed 1000 times to limit the effects of system jitter.

1 (cache-oblivious SpMV [1], [2]):

>> A=mmread('tbdlinux.mtx');
>> [I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A,50,0.1,2);
>> x=rand(size(A,2),1);
>> tic, for i=1:1000 A*x; end, toc
Elapsed time is 24.707203 seconds.
>> tic, z=x(q), for i=1:1000 B*z; end, toc
Elapsed time is 19.786526 seconds.

Using Mondriaan to transform the tbdlinux matrix into SBD form thus yields a modest 20 percent speed increase. This is expected to be higher for matrices for which the input and output vectors no longer fit into the caches closer to main memory. This method of reordering for sparse matrix-vector multiplication also yields much better results when used with optimised datastructures, such as OSKI, Incremental CRS, or dedicated block-wise structures; see [2] for details.

2 (reducing fill-in during LU decomposition [3], [4]):

>> A=mmread('west0497.mtx');
>> [I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A,10,0.1,3);
>> tic, for i=1:1000 [L,U,lu_P] = lu(A); end, toc
Elapsed time is 3.659008 seconds.
>> nnz(L+U)

ans =

       13818

>> tic, for i=1:1000 [L,U,lu_P] = lu(B); end, toc
Elapsed time is 1.943670 seconds.
>> nnz(L+U)

ans =

        4647

Here the use of Mondriaan with BBD ordering lets the stock MATLAB LU algorithm run almost a factor 2 faster, and reduces the fill-in with almost a factor 3. Note that this is not the UMFPACK version of the LU algorithm, which employs its own reordering techniques (amongst others); see help lu within MATLAB.

Visualisation

We can also directly visualise the partitioning process by using mondriaanplot in the following fashion:

This concludes this small tutorial. More information is available through issuing help mondriaan from within MATLAB.

MondriaanOpt

Apart from Mondriaan itself, also MondriaanOpt is available in MATLAB through the MatlabMondriaanOpt MEX routine. Example matlab functions are given in mondriaanOpt.m and mondriaanOptPlot.m. The interface of mondriaanOpt is as follows:

Here, A is the sparse matrix to be partitioned, Imbalance is the maximum allowed load imbalance, Volume is the initial upper bound on the volume, I contains the partitioning information and s contains statistics about the run. For more information, type help mondriaanOpt or help mondriaanOptPlot in MATLAB.

References

[1] Cache-oblivious sparse matrix-vector multiplication by using sparse matrix partitioning methods, A. N. Yzelman and Rob H. Bisseling, SIAM Journal of Scientific Computation, Vol. 31, Issue 4, pp. 3128-3154 (2009).
[2] Two-dimensional cache-oblivious sparse matrix-vector multiplication, A. N. Yzelman and Rob H. Bisseling, Parallel Computing, Vol. 37, Issue 12, pp. 806-819 (2011).
[3] Hypergraph-partitioning-based sparse matrix ordering, Ümit V. Çatalyürek and C. Aykanat, Second International Workshop on Combinatorial Scientic Computing, CERFACS, 2005.
[4] Hypergraph-based Unsymmetric Nested Dissection Ordering for Sparse LU Factorization, L. Grigori, E. G. Boman, S. Donfack, and T. A. Davis, SIAM Journal of Scientific Computation, Vol. 32, Issue 6, pp. 3426-3446 (2010).


Last updated: August 8, 2019.

July 27, 2010 by Bas Fagginger Auer,
December 10, 2010 by A. N. Yzelman,
March 27, 2012 by Bas Fagginger Auer,
August 29, 2013 by Rob Bisseling and Bas Fagginger Auer,
November 3, 2016 by Marco van Oort,
September 7, 2017 by Marco van Oort.

To Home page Mondriaan package.

Valid HTML 4.01 Strict Valid CSS!