Exploiting fast matrix multiplication within the level 3 BLAS
- 1 December 1990
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Mathematical Software
- Vol. 16 (4), 352-368
- https://doi.org/10.1145/98267.98290
Abstract
The Level 3 BLAS (BLAS3) are a set of specifications of FORTRAN 77 subprograms for carrying out matrix multiplications and the solution of triangular systems with multiple right-hand sides. They are intended to provide efficient and portable building blocks for linear algebra algorithms on high-performance computers. We describe algorithms for the BLAS3 operations that are asymptotically faster than the conventional ones. These algorithms are based on Strassen's method for fast matrix multiplication, which is now recognized to be a practically useful technique once matrix dimensions exceed about 100. We pay particular attention to the numerical stability of these “fast BLAS3.” Error bounds are given and their significance is explained and illustrated with the aid of numerical experiments. Our conclusion is that the fast BLAS3, although not as strongly stable as conventional implementations, are stable enough to merit careful consideration in many applications.
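For readers unfamiliar with Strassen's method, the following is a minimal Python sketch of the recursion the abstract refers to. It is not the paper's FORTRAN 77 implementation. The function names, the list-of-lists matrix representation, and the default `cutoff` value are assumptions made for illustration, and the sketch handles only square matrices, falling back to the conventional product when the dimension is odd or at most `cutoff`. Each level of recursion replaces 8 half-size block products with 7, which gives an operation count of order n^(log2 7), roughly n^2.81, instead of n^3.

```python
def mat_add(X, Y):
    """Elementwise sum of two equally sized matrices (lists of rows)."""
    return [[x + y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_sub(X, Y):
    """Elementwise difference of two equally sized matrices."""
    return [[x - y for x, y in zip(rx, ry)] for rx, ry in zip(X, Y)]


def mat_mul_conventional(A, B):
    """Conventional O(n^3) matrix product."""
    inner, cols = len(B), len(B[0])
    return [[sum(row[k] * B[k][j] for k in range(inner)) for j in range(cols)]
            for row in A]


def strassen(A, B, cutoff=64):
    """Product of two square n-by-n matrices using Strassen's recursion.

    `cutoff` is an illustrative threshold (an assumption, not a value taken
    from the paper): at or below it, or when n is odd, the conventional
    product is used instead.
    """
    n = len(A)
    if n <= cutoff or n % 2 != 0:
        return mat_mul_conventional(A, B)

    h = n // 2
    # Split each operand into four h-by-h blocks.
    A11 = [row[:h] for row in A[:h]]
    A12 = [row[h:] for row in A[:h]]
    A21 = [row[:h] for row in A[h:]]
    A22 = [row[h:] for row in A[h:]]
    B11 = [row[:h] for row in B[:h]]
    B12 = [row[h:] for row in B[:h]]
    B21 = [row[:h] for row in B[h:]]
    B22 = [row[h:] for row in B[h:]]

    # Seven recursive block products instead of the conventional eight.
    M1 = strassen(mat_add(A11, A22), mat_add(B11, B22), cutoff)
    M2 = strassen(mat_add(A21, A22), B11, cutoff)
    M3 = strassen(A11, mat_sub(B12, B22), cutoff)
    M4 = strassen(A22, mat_sub(B21, B11), cutoff)
    M5 = strassen(mat_add(A11, A12), B22, cutoff)
    M6 = strassen(mat_sub(A21, A11), mat_add(B11, B12), cutoff)
    M7 = strassen(mat_sub(A12, A22), mat_add(B21, B22), cutoff)

    # Recombine the products into the four blocks of C = A * B.
    C11 = mat_add(mat_sub(mat_add(M1, M4), M5), M7)
    C12 = mat_add(M3, M5)
    C21 = mat_add(M2, M4)
    C22 = mat_add(mat_add(mat_sub(M1, M2), M3), M6)

    top = [left + right for left, right in zip(C11, C12)]
    bottom = [left + right for left, right in zip(C21, C22)]
    return top + bottom
```

With integer entries the result matches the conventional product exactly. In floating-point arithmetic the extra block additions and subtractions change how rounding errors accumulate, which is the stability question the abstract raises.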
This publication has 12 references indexed in Scilit:
- Fast Polar Decomposition of an Arbitrary Matrix. SIAM Journal on Scientific and Statistical Computing, 1990
- A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 1990
- Algorithm 679: A set of level 3 basic linear algebra subprograms: model implementation and test programs. ACM Transactions on Mathematical Software, 1990
- The Accuracy of Solutions to Triangular Systems. SIAM Journal on Numerical Analysis, 1989
- Extra High Speed Matrix Multiplication on the Cray-2. SIAM Journal on Scientific and Statistical Computing, 1988
- Impact of Hierarchical Memory Systems On Linear Algebra Algorithm Design. The International Journal of Supercomputing Applications, 1988
- The Use of BLAS3 in Linear Algebra on a Parallel Processor with a Hierarchical Memory. SIAM Journal on Scientific and Statistical Computing, 1987
- Further Comparisons of Direct Methods for Computing Stationary Distributions of Markov Chains. SIAM Journal on Algebraic Discrete Methods, 1987
- Stability of fast algorithms for matrix multiplication. Numerische Mathematik, 1980
- Computational Complexity and Numerical Stability. SIAM Journal on Computing, 1975