Closing the Gap: CPU and FPGA Trends in Sustainable Floating-Point BLAS Performance

23 December 2004

proceedings article
Published by Institute of Electrical and Electronics Engineers (IEEE)

p. 219-228
https://doi.org/10.1109/fccm.2004.21

Abstract

Field-programmable gate arrays (FPGAs) have long been an attractive alternative to microprocessors for computing tasks, as long as floating-point arithmetic is not required. Fueled by the advance of Moore's Law, FPGAs are rapidly reaching sufficient densities to enhance peak floating-point performance as well. The question, however, is how much of this peak performance can be sustained. This paper examines three of the basic linear algebra subroutine (BLAS) functions: vector dot product, matrix-vector multiply, and matrix multiply. A comparison of microprocessors, FPGAs, and reconfigurable computing platforms is performed for each operation. The analysis highlights the amount of memory bandwidth and internal storage needed to sustain peak performance with FPGAs. It considers the historical context of the last six years and extrapolates to the next six years.
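The abstract's link between the three BLAS operations and memory bandwidth can be illustrated with a small sketch. The Python below is not taken from the paper. It is a minimal illustration that assumes square problems of size n and an idealized memory model in which every operand word is moved exactly once. The function names (dot, matvec, matmul, flops_per_word) are hypothetical and chosen only for this example.

```python
# Illustrative sketch only; not the paper's method or data.
# Assumption: every input and output word crosses the memory interface once.

def dot(x, y):
    """Vector dot product (Level 1 BLAS style)."""
    acc = 0.0
    for a, b in zip(x, y):
        acc += a * b  # one multiply and one add per element pair
    return acc

def matvec(A, x):
    """Matrix-vector multiply (Level 2 BLAS style); A is a list of rows."""
    return [dot(row, x) for row in A]

def matmul(A, B):
    """Matrix-matrix multiply (Level 3 BLAS style); A and B are lists of rows."""
    cols = list(zip(*B))  # columns of B
    return [[dot(row, col) for col in cols] for row in A]

def flops_per_word(n):
    """Floating-point operations per memory word for n-sized square problems."""
    return {
        # 2n flops; 2n input words plus 1 output word
        "dot": (2 * n) / (2 * n + 1),
        # 2n^2 flops; n^2 + n input words plus n output words
        "matvec": (2 * n * n) / (n * n + 2 * n),
        # 2n^3 flops; 2n^2 input words plus n^2 output words
        "matmul": (2 * n ** 3) / (3 * n * n),
    }

if __name__ == "__main__":
    for n in (10, 100, 1000):
        print(n, flops_per_word(n))
```

Under these assumptions, the dot product and matrix-vector multiply stay at or below roughly two operations per word however large n becomes, so sustaining their peak rate depends mainly on memory bandwidth. Matrix multiply reuses each operand about n times, so its ratio grows with n, provided there is enough internal storage to hold the reused data.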
