OSKI: A library of automatically tuned sparse matrix kernels
Top Cited Papers
- 1 January 2005
- journal article
- Published by IOP Publishing in Journal of Physics: Conference Series
- Vol. 16, 521-530
- https://doi.org/10.1088/1742-6596/16/1/071
Abstract
The Optimized Sparse Kernel Interface (OSKI) is a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. These kernels include sparse matrix-vector multiply and sparse triangular solve, among others. The primary aim of this interface is to hide the complex decision- making process needed to tune the performance of a kernel implementation for a particular user's sparse matrix and machine, while also exposing the steps and potentially non-trivial costs of tuning at run-time. This paper provides an overview of OSKI, which is based on our research on automatically tuned sparse kernels for modern cache-based superscalar machines. 1. Goals and Motivation We describe the Optimized Sparse Kernel Interface (OSKI), a collection of low-level primitives that provide automatically tuned computational kernels on sparse matrices, for use by solver libraries and applications. The kernels include sparse matrix-vector multiply (SpMV) and sparse triangular solve (SpTS), among others; "tuning" refers to the process of selecting the data structure and code transformations that lead to the fastest implementation of a kernel, given a machine and matrix. While conventional implementations of SpMV have historically run at 10% of machine peak or less, careful tuning can achieve up to 31% of peak and 4◊ speedups (1, Chap. 1). The challenge is that we must often defer tuning until run-time, since the matrix may be unknown until then. The need for run-time tuning diers from the case of dense kernelsKeywords
This publication has 13 references indexed in Scilit:
- Sparsity: Optimization Framework for Sparse Matrix KernelsThe International Journal of High Performance Computing Applications, 2004
- An updated set of basic linear algebra subprograms (BLAS)ACM Transactions on Mathematical Software, 2002
- An overview of the sparse basic linear algebra subprogramsACM Transactions on Mathematical Software, 2002
- Algorithm 818ACM Transactions on Mathematical Software, 2002
- Automated empirical optimizations of software and the ATLAS projectParallel Computing, 2001
- PSBLASACM Transactions on Mathematical Software, 2000
- The automatic generation of sparse primitivesACM Transactions on Mathematical Software, 1998
- Efficient Management of Parallelism in Object-Oriented Numerical Software LibrariesPublished by Springer Nature ,1997
- Algorithm‐oriented generic librariesSoftware: Practice and Experience, 1994
- Sparse Matrices in MATLAB: Design and ImplementationSIAM Journal on Matrix Analysis and Applications, 1992