Parallel Algorithms (WISM 459), 2022/2023


Teacher

Rob Bisseling.

Teaching assistant

Constantijn Dekker, email: tijndekker12@gmail.com

Time and place

Every Wednesday from 10.00-12.45 hour on location at Utrecht Science Park: Room HFG611. The format is 2x45 minutes lectures (10-10.45 hour, 11-11.45 hour), followed by 45 minutes laboratory class (12-12.45 hour), which is an interactive session. In between we have 15 minute breaks. I have made a set of 26 videos of length 10-20 minutes, which contain the most important parts of my lectures. The videos also have interactive questions and final questions with solutions given in a separate document (as a PDF file). The videos will become available through YouTube and a Utrecht University video repository. If you miss a class because you have a cold, or are otherwise ill disposed you can view these recordings instead, and contact me for interaction by any of the other electronic means (email, Teams). But nothing beats a live lecture, with human interaction, and important practicals, and you will get a lot more from class if you physically attend. First lecture: September 14, 2022.

Midterm exam: Oct 26, 2022, 10.00-12.00 hour. Room EDU ALFA in the Educatorium. Retake exam: Dec 21, 2022, 10.00-12.00 hour. Room Ruppert Paars. The exams will be closed book.

Intended audience

This is a Master's course which is an elective course for students of the Master's programme Mathematical Sciences of Utrecht University, part of the track Applied Mathematics, Complex Systems, and Scientific Computing. The course is recommended for students in Mathematics or Computer Science who are interested in scientific computing and would like to obtain a first `hands-on experience' in parallel programming. It is also highly suitable for the Honours master programme in Theoretical Physics and Mathematical Sciences and for students interested in Computational Physics. The course is part of the Dutch National Master Mastermath and hence is open for all interested master students in Mathematics in the Netherlands. Registration for everybody is through Mastermath.

Course aim

Students will learn how to design a parallel algorithm for a problem from the area of scientific computing and how to write a parallel program that solves the problem.

Learning goals

After completion of the course, the student is able to

Credits

You get 8 ECTS credit points.

Contents

Today, parallel computers are appearing on our desktops. The advent of dualcore, quadcore, and octacore computers and the expected increase in the number of cores in the coming years causes a major change in our approach to software, such as the mathematical software we use in scientific computations and in the emerging area of big data computations.

Parallel computers drastically increase our computational capabilities and thus enable us to model more realistically in many application areas. To make efficient use of parallel computers, it is necessary to reorganise the structure of our computational methods. In particular, attention must be paid to the division of work among the different processors solving a problem in parallel and to the communication between them. Suitable parallel algorithms and systems software are needed to realise the capabilities of parallel computers.

We will discuss extensively the most recent developments in the area of parallel computers, ranging from multicore laptop or desktop PCs to clusters of PCs connected by switching devices, to massively parallel computers with distributed memory such as our national supercomputer Snellius at SURF in Amsterdam, which was officially opened by our Queen on September 16, 2021.

The following subjects will be treated:

Language

The course will be given in English. All reading material will be in English.

Prerequisites

Introductory course in linear algebra. Some knowledge of algorithms. Knowledge of a modern programming language such as C, C++, C#, Python, or Java. Basic knowledge of C is helpful, as we will use this language in class, but we will organise the course in such a way that there will be time to adapt to it for those coming from another language. For those with little or no experience with C, this course is an opportunity to learn C, which is much faster than Python and hence commonly used in high performance computing. There are plenty of books on C; you may consider the book Practical C Programming by Steve Oualline. You can also use C++ in class, if you already know that language. Some knowledge of the UNIX/Linux operating system is helpful.

Software

We will make use of the MulticoreBSP for C software developed by Albert-Jan Yzelman, version 2.0.4 (March 2019). This software runs on shared-memory multicore PCs, and you can also run your program without any changes on distributed-memory machines such as Snellius.

C++ programmers can use Bulk, version 3.0.0 (August 19, 2021) as a communications library, instead of MulticoreBSP for C. It is a very modern library that can run on both shared-memory and distributed-memory machines, and on hybrids of the two, without any change. See Bulk: a Modern C++ Interface for Bulk-Synchronous Parallel Programs by Jan-Willem Buurlage, Tom Bannink, and Rob H. Bisseling, in Proceedings Euro-Par 2018, Lecture Notes in Computer Science, Vol. 11014, Springer, 2018, pp. 519-532. Published version.

Hardware

Recommended: Use your own device! Please install the MulticoreBSP library for C (in most cases) or Bulk for C++ (only if you are an experienced C++ user). On Macs and Linux computers this is straightforward. On Macs you have UNIX through your terminal. On Windows machines you may want to use the Windows Subsystem for Linux (WSL). For Windows, it is more difficult to get the software running and you may want to ask assistance.

You will get access to the national supercomputer Snellius, where you can use BSPonMPI which has been installed by SURF, or Bulk, both giving access to thousands of cores.

Examination

The examination is in the form of an initial assignment (20%), a final assignment (40%), a presentation on the final assignment (20%), and a midterm exam (20%). The two assignments are carried out at home. A written report must be returned for each assignment before the deadline. Students can work individually or in pairs (but not in larger teams) on the computer programs and can hand in a single report, provided each did a fair share of the work and can account for that. The contribution of each individual must then be stated briefly in the report. Each participant has to give an individual presentation on the chosen topic of the final assignment. Presentations will be scheduled on November 30, December 7 and 14, 2022. The midterm exam must be passed with a grade of 5.0 (on a scale of 10). All students should submit reports for the assignments electronically in PDF format through the ELO system of Mastermath. There is no further homework. All students must use LaTeX for the assignments.

The midterm exam can be retaken near the end of the course (in December) by any course participant. The maximum of the two exam grades counts as the midterm exam grade. In case of illness, or for another urgent reason, the final presentation can be retaken on an individual basis in January 2023. In case of an insufficient final grade for the whole course (<=5), the participant is entitled to a retake, which means that one assignment can be redone within six weeks after the final grading of the course.

Literature

We closely follow the book Parallel Scientific Computation: A Structured Approach using BSP (second edition) (PSC2), by Rob H. Bisseling, Oxford University Press, September 2020. ISBN 9780198788355 (for the paperback version). The book is available as paperback, hardback, ebook, and also in Oxford Scholarship Online. Bol.com has a small stock of the paperback version, and if you order early this will be quicker and probably cheaper. (At some point they will have to restock.)

Supplementary material such as all slides accompanying the book and the software package BSPedupack 2.0 with all the programs of the book are available on the supplementary material page. Here you can find all the lecture slides of this course and also pointers to the videos when they appear.

Weekly schedule and links

This year we will cover parts of chapters 1, 2, 4, 5 of the second edition (PSC2).

Wednesday September 14, 2022

The Bulk Synchronous Parallel model

(PSC2 section 1.2)

Parallel Inner Product Computation

(PSC2 section 1.3)

Laboratory class.

Try your luck on the theoretical exercise 1.2.

Interesting links

Wednesday September 21, 2022.

BSPlib, the Bulk Synchronous Parallel library.

(PSC2 section 1.4)

Benchmarking

(PSC2 sections 1.5-1.7)

Laboratory class. Starting with BSPlib.

This is your first hands-on session with BSPlib. Download the latest version of BSPedupack, my package of educational programs that teaches how to use BSP. Solve Exercise 1.3 from PSC2: try to run the benchmark, exploring your first parallel environment: your own laptop. Modify your program to send larger messages. How does this influence g?

Start working on the first assignment, Exercise 1.7 from PSC2. Design a parallel algorithm for prime number generation. Write your first parallel program. This program should generate all prime numbers below 10,000,000 in parallel. Choosing the right distribution is the name of the game. You should carry out the experiments of your first assignment on your own parallel computer. (This may be your laptop or desktop, or a compute server of your own university.) Snellius will only be available for the second assignment.

Hand in a report on Exercise 1.7 presenting your algorithm and the experimental results, before the deadline Wednesday November 2, 10.00 hour. Note that you can work in pairs.

Interesting links:

Wednesday September 28, 2022.

Parallel sample sort

(PSC2 section 1.8)

Program bspsort and bulk synchronous message passing

(PSC2 section 1.9)

Laboratory class. Discussion of Solution Exercise 1.2

Interesting links:

Wednesday October 5, 2022.

Sequential LU Decomposition

(PSC2 sections 2.1-2.2)

Parallel LU Decomposition

(PSC2 section 2.3)

Laboratory class. Discussion of report requirements

Wednesday October 12, 2022.

Two-Phase Broadcasting

(PSC2 section 2.4)

High-performance LU decomposition

(PSC2 section 2.5)

Laboratory class. Work on your assignment.

Interesting links:

Wednesday October 19, 2022.

Sequential sparse matrix–vector multiplication

(PSC2 section 4.1)

Sparse matrices and their data structures

(PSC2 section 4.2)

Laboratory class. Discussion of solution midterm exam from October 24, 2018

We will discuss all questions, which are representative for this year's closed-book exam. Please try first to solve the questions yourself.

Wednesday October 26, 2022. Location: Educatorium Alfa

10.00-12.00 Midterm exam on Chapters 1 and 2.

The exact exam material will be announced in the study guide. Good material to practice are also the final questions at the end of my videos. The solutions are provided in a separate PDF document.

Interesting links: previous exams from the archive. Exams from 2012, 2013 are only on Chapter 1 of the material (60 minutes). Exams 2017 on Chapter 1, 3 (90 minutes).

Wednesday November 2, 2022.

Parallel sparse matrix-vector multiplication

(PSC2 section 4.3)

Cartesian matrix distributions

(PSC2 section 4.4)

Laboratory class

Choose an assignment for your final project. Possibilities: You can parallelise your own problem, or choose one of the exercises from the book PSC2. For example:

Please try to discuss the chosen topic with the teacher before November 16, 2022 and register the topic by sending an email to the teacher. You can work in pairs, but have to give individual presentations on the topic. It is OK for teams to choose the same topic, but please try to be original in your approach.

Wednesday November 9, 2022. Changed location: BBG065

Mondriaan sparse matrix distribution

(PSC2 section 4.5)

Medium-grain method for sparse matrix distribution

(PSC2 section 4.6)

Laboratory class. Discussion of the midterm exam

Interesting links:

Wednesday November 16, 2022

Laplacian matrices

(PSC2 section 4.9)

Sequential graph matching

(PSC2 sections 5.1-5.2)

Exercise class. Discussion of the first assignment.

Wednesday November 23, 2022.

Suitors and sorting

(PSC2 section 5.3)

Parallel graph matching

(PSC2 section 5.4)

Laboratory class

Wednesday November 30, 2022.

Presentations final project

Laboratory class

Wednesday December 7, 2022.

Presentations final project

Laboratory class

Wednesday December 14, 2022.

Excursion to Snellius at Surf in Amsterdam

We got a guided tour of the supercomputer Snellius and attended a lecture by Carlos Teijeiro Barjas from Surf on "High Performance Computing and Parallelization Strategies". The picture at the top of this page was taken in front of Snellius, and it shows both the GPU partition (left) and a part of the CPU partition.

Wednesday December 21, 2022. Location Ruppert Paars

10.00-12.00 Retake midterm exam

Deadline second assignment Monday January 16, 2023, 12.00 hour.

Hand in a report (through the Mastermath ELO, in PDF) before the deadline. Only one report per team. No need to send in duplicates. Please use the batch system of Snellius and not the interactive system for test runs of your parallel program. The interactive system is only for development. Running your parallel program on hundreds of processor cores will give you a bonus.

Please fill out the Mastermath digital evaluation form, which will be sent to you around the deadline.

Frequently asked questions


Last update of this page: December 16, 2022
Back to homepage of Rob Bisseling