## Mondriaan and MATLAB

This guide is a step-by-step introduction to using Mondriaan together with MATLAB. For more extensive information about Mondriaan, please take a look at the user's guide.

### How to download and install Mondriaan

Download the latest version from the Mondriaan software homepage. Uncompress with

`% tar xzvf mondriaan4.tar.gz`

This will create a directory `Mondriaan4`

which contains all the files of the Mondriaan package.
To enable MATLAB support, open the file `Mondriaan4/mondriaan.mk`

with a text-editor and look for a line which looks similar to

`#MATLABHOMEDIR := /usr/local/matlab`

Change the directory on the right-hand side to your installation
directory of MATLAB and remove the `#`

in front of the line,
such that it looks similar to

`MATLABHOMEDIR := /your/matlab/installation/directory`

Furthermore make sure that the variable `MEXSUFFIX`

is set to the proper
extension for MATLAB binary files for your system (from the Mathworks site):

Platform | `MEXSUFFIX` |

Linux (32-bit) | `mexglx` |

Linux (64-bit) | `mexa64` |

Apple Macintosh (32-bit) | `mexmaci` |

Apple Macintosh (64-bit) | `mexmaci64` |

Microsoft Windows (32-bit) | `mexw32` |

Microsoft Windows (64-bit) | `mexw64` |

For example: on a 32-bit Macintosh system we would have `MEXSUFFIX := mexmaci`

.

Now we are ready to compile Mondriaan, run

`% make`

which will build Mondriaan and the associated tools.

### A small example

In this example we will partition a small test matrix using the MATLAB interface of Mondriaan.

As test matrix we can use tbdmatlab.mtx.gz
from the Mondriaan website. The archive should be extracted to the `Mondriaan4/tools`

directory.

Start MATLAB and navigate to the `Mondriaan4/tools`

directory in the *Current Directory*
subwindow.
To read and view `tbdmatlab.mtx`

, issue

`A = mmread('tbdmatlab.mtx');`

`spy(A)`

We can partition the matrix `A`

among 30 processors with a maximum imbalance of 3% by using
the `mondriaan`

function in MATLAB

`[I, s] = mondriaan(A, 30, 0.03);`

where `I`

is the same matrix as `A`

, only with the real values
of all the matrix nonzeroes set to the index of the processor to which
the nonzero was assigned, and `s`

contains partitioning information.
Full output can be generated with

`[I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A, 30, 0.03, 2);`

where the last parameter (`2`

) is the desired permutation method (see below).
Here, p and q are permutation vectors, r and c are row-boundaries and column-boundaries
corresponding to the ordering's block structure, rh and ch store the separator hierarchy
information, the matrix B stores the reordered matrix PAQ (in MATLAB terminology:
B=A(p,q)) and finally u and v contain the indices of the processors to which the vector
components are assigned (for parallel multiplication of *u = A*v*).
See the User's Guide for full details on these output
vectors and matrices. For particulars on the boundary and hierarchy functions, jump to
the appropriate section here.

Value | Ordering |

0 | None (default value) |

1 | reverse BBD (reverse Bordered Block Diagonal) |

2 | SBD (Separated Block Diagonal) |

3 | BBD (Bordered Block Diagonal) |

### Exploiting symmetry

The MATLAB interface additionally has an option to make use of any symmetry properties of the input matrix A. This is done by setting a fifth parameter, such that the full call becomes:

`[I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A, 30, 0.03, 2, symm);`

where symm is 0 by default (if the parameter is not given), and indicates A is
not symmetric. If `symm`

takes a value 1 or 2, A is assumed symmetric
and *only the lower triangular part of A is passed through to Mondriaan*. This
is exactly the same as using the regular (terminal-based) Mondriaan application on a
symmetric matrix with the options SymmetricMatrix_UseSingleEntry set to **yes** and
SymmetricMatrix_SingleEntryType set to **lower**. If these options are not set in
the Mondriaan.defaults file, they will be forced. The matrices I and B will still
correspond to the full matrix A. Any SplitStrategy can still be used, and is
taken as usual from the Mondriaan.defaults file. Recommended is to use the finegrain
or symmetric finegrain strategies. Others will work, but may not minimise the
communication volume during parallel sparse matrix-vector multiplication when
considering the full matrix A.

Setting `symm`

to 2 will indicate the matrix is structurally symmetric,
but as said before, still only the lower triangular part of A is passed through to
Mondriaan. This makes no difference for any of the output parameters, except for B,
which would, for `symm`

=1, return an incorrect full matrix PAQ as the full
reordered matrix is inferred only from the lower triangular part. Setting `symm`

to 2 prevents this by automatically postprocessing B by rebuilding PAQ using the
output parameters p and q.

Note that setting `symm`

equal to 1 or 2 yields symmetric permutations
(B=PAP^{T}). Also note that it is not checked whether the input matrix really
is symmetric, and as such unsymmetric matrices can also be passed through this method.
This probably does not yield any meaningful results.

### Example uses

We present two small examples of using Matlab in conjunction with Mondriaan; the
first will be on speeding up the sequential sparse matrix-vector multiply, the
second will illustrate speeding up the sequential sparse LU decomposition.
Assumed is that the working directory is `Mondriaan4/tools`

. Also available
should be:

- the tbdlinux matrix (available through Rob Bisseling's webpage),
- the west0497 matrix (available through the Florida Sparse Matrix Collection).

In both examples, the experiment is executed 1000 times to limit the effects of system jitter.

##### 1 (cache-oblivious SpMV [1], [2]):

```
>> A=mmread('tbdlinux.mtx');
```

>> [I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A,50,0.1,2);

>> x=rand(size(A,2),1);

>> tic, for i=1:1000 A*x; end, toc

Elapsed time is 24.707203 seconds.

>> tic, z=x(q), for i=1:1000 B*z; end, toc

Elapsed time is 19.786526 seconds.

Using Mondriaan to transform the tbdlinux matrix into SBD form thus yields a modest 20 percent speed increase. This is expected to be higher for matrices for which the input and output vectors no longer fit into the caches closer to main memory. This method of reordering for sparse matrix-vector multiplication also yields much better results when used with optimised datastructures, such as OSKI, Incremental CRS, or dedicated block-wise structures; see [2] for details.

##### 2 (reducing fill-in during LU decomposition [3], [4]):

```
>> A=mmread('west0497.mtx');
```

>> [I, s, p, q, r, c, rh, ch, B, u, v] = mondriaan(A,10,0.1,3);

>> tic, for i=1:1000 [L,U,lu_P] = lu(A); end, toc

Elapsed time is 3.659008 seconds.

>> nnz(L+U)

ans =

13818

>> tic, for i=1:1000 [L,U,lu_P] = lu(B); end, toc

Elapsed time is 1.943670 seconds.

>> nnz(L+U)

ans =

4647

Here the use of Mondriaan with BBD ordering lets the stock MATLAB
LU algorithm run almost a factor 2 faster, and reduces the fill-in
with almost a factor 3. Note that this is not the UMFPACK version of
the LU algorithm, which employs its own reordering techniques
(amongst others); see `help lu`

within MATLAB.

### Visualisation

We can also directly visualise the partitioning process by using `mondriaanplot`

in the following fashion:

`mondriaanplot(A, 30, 0.03, 2);`

This concludes this small tutorial.
More information is available through issuing `help mondriaan`

from within MATLAB.

### MondriaanOpt

Apart from Mondriaan itself, also MondriaanOpt is available in MATLAB through the MatlabMondriaanOpt MEX routine. Example matlab functions are given in mondriaanOpt.m and mondriaanOptPlot.m. The interface of mondriaanOpt is as follows:

`[I, s] = mondriaanOpt(A, Imbalance, Volume)`

Here, `A`

is the sparse matrix to be partitioned, `Imbalance`

is the maximum allowed load imbalance,
`Volume`

is the initial upper bound on the volume, `I`

contains the partitioning information and
`s`

contains statistics about the run.
For more information, type `help mondriaanOpt`

or `help mondriaanOptPlot`

in MATLAB.

### References

[1]
*Cache-oblivious sparse matrix-vector multiplication by using sparse matrix partitioning methods*,
A. N. Yzelman and Rob H. Bisseling, SIAM Journal of Scientific Computation, Vol. 31, Issue 4, pp. 3128-3154 (2009).

[2]
*Two-dimensional cache-oblivious sparse matrix-vector multiplication*,
A. N. Yzelman and Rob H. Bisseling, Parallel Computing, Vol. 37, Issue 12, pp. 806-819 (2011).

[3]
*Hypergraph-partitioning-based sparse matrix ordering*,
Ümit V. Çatalyürek and C. Aykanat, Second International Workshop on Combinatorial Scientic Computing, CERFACS, 2005.

[4]
*Hypergraph-based Unsymmetric Nested Dissection Ordering for Sparse LU Factorization*,
L. Grigori, E. G. Boman, S. Donfack, and T. A. Davis, SIAM Journal of Scientific Computation,
Vol. 32, Issue 6, pp. 3426-3446 (2010).

Last updated: November 3, 2016.

July 27, 2010 by Bas Fagginger Auer,

December 10, 2010 by A. N. Yzelman,

March 27, 2012 by Bas Fagginger Auer,

August 29, 2013 by Rob Bisseling and Bas Fagginger Auer.

November 3, 2016 by Marco van Oort.

To
Home page Mondriaan package.