A flexible class of parallel matrix multiplication algorithms

27 November 2002

conference paper
Published by Institute of Electrical and Electronics Engineers (IEEE)

No. 10637133,p. 110-116
https://doi.org/10.1109/ipps.1998.669898

Abstract

This paper explains why parallel implementation of matrix multiplication, a seemingly simple algorithm that can be expressed as one statement and three nested loops, is complex. Practical algorithms that use matrix multiplication tend to use matrices of disparate shapes, and the shape of the matrices can significantly impact the performance of matrix multiplication. We provide a class of algorithms that covers the spectrum of shapes encountered and demonstrate that good performance can be attained if the right algorithm is chosen. While the paper resolves a number of issues, it concludes with discussion of a number of directions yet to be pursued.

Keywords

This publication has 11 references indexed in Scilit:

A matrix product algorithm and its comparative performance on hypercubes
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2003
ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computers
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2003
Interprocessor collective communication library (InterCom)
Published by Institute of Electrical and Electronics Engineers (IEEE) ,2002
A poly-algorithm for parallel dense matrix multiplication on two-dimensional process grid topologies
Concurrency: Practice and Experience, 1997
SUMMA: scalable universal matrix multiplication algorithm
Concurrency: Practice and Experience, 1997
A three-dimensional approach to parallel matrix multiplication
IBM Journal of Research and Development, 1995
A high-performance matrix-multiplication algorithm on a distributed-memory parallel computer, using overlapped communication
IBM Journal of Research and Development, 1994
Matrix multiplication on the Intel Touchstone Delta
Concurrency: Practice and Experience, 1994
Pumma: Parallel universal matrix multiplication algorithms on distributed memory concurrent computers
Concurrency: Practice and Experience, 1994
Matrix algorithms on a hypercube I: Matrix multiplication
Parallel Computing, 1987