Floating-Point Matrix Multiplication in a Polymorphic Processor
- 1 December 2007
- conference paper
- Published by Institute of Electrical and Electronics Engineers (IEEE)
- p. 249-252
- https://doi.org/10.1109/fpt.2007.4439258
Abstract
We consider 64-bit floating-point matrix multiplication in the context of polymorphic processor architectures. Our proposal provides a complete, performance-efficient solution to the matrix multiplication problem, including the hardware design and the software interface. We adopt previous ideas [1], originally proposed for loosely coupled processors and message-passing communication, and apply them in a tightly coupled custom computing unit (CCU) in the Molen polymorphic processor. Furthermore, we introduce a controller that facilitates the efficient operation of the multiplier processing elements (PEs) in a polymorphic environment. The design is evaluated theoretically and through real hardware experiments. More precisely, we fit 9 processing elements in an XC2VP30-6 device; this configuration has a theoretical peak performance of 1.80 GFLOPS. In practice, we measured sustained performance of up to 1.79 GFLOPS for matrix multiplication on real hardware, including the software overhead. Theoretical analysis and experimental results suggest that the design efficiency is higher for larger problem sizes.
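The performance figures in the abstract can be related to each other with a little arithmetic. The sketch below is illustrative only: the PE count and the two GFLOPS figures come from the abstract, while the rate of 2 floating-point operations per PE per cycle (one multiply plus one add) is an assumption, so the clock frequency it derives is an implied value and not one reported here. The function names are hypothetical.

```python
# Figures quoted in the abstract.
NUM_PES = 9
PEAK_GFLOPS = 1.80
SUSTAINED_GFLOPS = 1.79

# ASSUMPTION (not stated in the abstract): each PE completes one multiply
# and one add per clock cycle, i.e. 2 floating-point operations per cycle.
FLOPS_PER_PE_PER_CYCLE = 2


def per_pe_gflops(peak_gflops, num_pes):
    """Peak throughput attributable to a single processing element."""
    return peak_gflops / num_pes


def implied_clock_mhz(peak_gflops, num_pes, flops_per_cycle):
    """Clock frequency implied by the peak figure under the assumed rate."""
    flops_per_second = peak_gflops * 1e9
    cycles_per_second = flops_per_second / (num_pes * flops_per_cycle)
    return cycles_per_second / 1e6


def efficiency(sustained_gflops, peak_gflops):
    """Fraction of the theoretical peak reached in practice."""
    return sustained_gflops / peak_gflops


if __name__ == "__main__":
    print(f"Per-PE peak: {per_pe_gflops(PEAK_GFLOPS, NUM_PES):.2f} GFLOPS")
    clock = implied_clock_mhz(PEAK_GFLOPS, NUM_PES, FLOPS_PER_PE_PER_CYCLE)
    print(f"Implied clock (assumed 2 FLOPs/cycle/PE): {clock:.0f} MHz")
    print(f"Sustained/peak: {efficiency(SUSTAINED_GFLOPS, PEAK_GFLOPS):.1%}")
```

Under that assumption, the best sustained figure corresponds to roughly 99% of the theoretical peak, consistent with the abstract's remark that efficiency is higher for larger problems.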
This publication has 5 references indexed in Scilit:
- Scalable and Modular Algorithms for Floating-Point Matrix Multiplication on Reconfigurable Computing Systems. IEEE Transactions on Parallel and Distributed Systems, 2007
- 64-bit floating-point FPGA matrix multiplication. Published by Association for Computing Machinery (ACM), 2005
- The MOLEN polymorphic processor. IEEE Transactions on Computers, 2004
- GEMM-based level 3 BLAS. ACM Transactions on Mathematical Software, 1998
- A set of level 3 basic linear algebra subprograms. ACM Transactions on Mathematical Software, 1990