A flexible class of parallel matrix multiplication algorithms
- 27 November 2002
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- No. 10637133,p. 110-116
- https://doi.org/10.1109/ipps.1998.669898
Abstract
This paper explains why parallel implementation of matrix multiplication, a seemingly simple algorithm that can be expressed as one statement and three nested loops, is complex. Practical algorithms that use matrix multiplication tend to use matrices of disparate shapes, and the shape of the matrices can significantly impact the performance of matrix multiplication. We provide a class of algorithms that covers the spectrum of shapes encountered and demonstrate that good performance can be attained if the right algorithm is chosen. While the paper resolves a number of issues, it concludes with discussion of a number of directions yet to be pursued.Keywords
This publication has 11 references indexed in Scilit:
- A matrix product algorithm and its comparative performance on hypercubesPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2003
- ScaLAPACK: a scalable linear algebra library for distributed memory concurrent computersPublished by Institute of Electrical and Electronics Engineers (IEEE) ,2003
- Interprocessor collective communication library (InterCom)Published by Institute of Electrical and Electronics Engineers (IEEE) ,2002
- A poly-algorithm for parallel dense matrix multiplication on two-dimensional process grid topologiesConcurrency: Practice and Experience, 1997
- SUMMA: scalable universal matrix multiplication algorithmConcurrency: Practice and Experience, 1997
- A three-dimensional approach to parallel matrix multiplicationIBM Journal of Research and Development, 1995
- A high-performance matrix-multiplication algorithm on a distributed-memory parallel computer, using overlapped communicationIBM Journal of Research and Development, 1994
- Matrix multiplication on the Intel Touchstone DeltaConcurrency: Practice and Experience, 1994
- Pumma: Parallel universal matrix multiplication algorithms on distributed memory concurrent computersConcurrency: Practice and Experience, 1994
- Matrix algorithms on a hypercube I: Matrix multiplicationParallel Computing, 1987