Accelerating matrix product on reconfigurable hardware for image processing applications

Abstract
Matrix multiplication is a core operation in many applications, including image and signal processing. This paper investigates the suitability of reconfigurable hardware, in the form of field-programmable gate arrays (FPGAs), as a low-cost solution for implementing two matrix multipliers: one for 3-D affine transformations and one for colour space conversion. The first design targets large matrix multiplications, as required for large 3-D models, and is also used to evaluate the performance of the Celoxica fixed-point library and Xilinx CoreGen. The second is a novel architecture for the efficient implementation of a colour space converter (CSC) based on distributed arithmetic (DA) principles. Both multipliers have been developed and implemented on the Celoxica RC1000-PP board-based development platform. Results show that the first, parallel FPGA-based multiplier can achieve the performance of a graphics card when performing 3-D affine transformations, while the second multiplier, which is fully pipelined and platform-independent, has a low latency (8 cycles) and is capable of a sustained data rate of over 234 mega-conversions per second.
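
To illustrate the distributed arithmetic (DA) principle behind the colour space converter, the following is a minimal software sketch, not the architecture reported in the paper. It replaces the multiplications in a 3x3 constant-matrix product with an 8-entry look-up table per output and bit-serial shift-and-add. The coefficient values (BT.601-style RGB to YCbCr), the 8-bit fixed-point scaling, and all function names are assumptions made for illustration only.

```python
# Sketch of distributed arithmetic (DA) for a 3x3 constant-matrix product.
# All constants and names below are illustrative assumptions.

FRAC_BITS = 8  # assumed number of fractional bits for the coefficients

# Assumed RGB -> YCbCr coefficients (BT.601-style, full range).
COEFFS = [
    [0.299, 0.587, 0.114],
    [-0.168736, -0.331264, 0.5],
    [0.5, -0.418688, -0.081312],
]
OFFSETS = [0, 128, 128]


def quantize(row):
    """Convert real coefficients to signed fixed-point integers."""
    return [round(c * (1 << FRAC_BITS)) for c in row]


def build_lut(qrow):
    """LUT entry at address a is the sum of the coefficients whose bit is set in a."""
    return [sum(q for k, q in enumerate(qrow) if (a >> k) & 1) for a in range(8)]


def da_dot(lut, xs, bits=8):
    """Dot product of constant coefficients with unsigned inputs, using no multiplies."""
    acc = 0
    for b in range(bits):
        # Gather bit b of each input into a 3-bit LUT address.
        addr = 0
        for k, x in enumerate(xs):
            addr |= ((x >> b) & 1) << k
        # Weight the partial sum by 2**b and accumulate.
        acc += lut[addr] << b
    return acc


def direct_dot(qrow, xs):
    """Reference multiply-accumulate, used to check the DA result."""
    return sum(q * x for q, x in zip(qrow, xs))


QROWS = [quantize(row) for row in COEFFS]
LUTS = [build_lut(qrow) for qrow in QROWS]


def rgb_to_ycbcr(rgb):
    """Convert one 8-bit RGB pixel using the DA look-up tables."""
    out = []
    for lut, offset in zip(LUTS, OFFSETS):
        acc = da_dot(lut, rgb)
        value = ((acc + (1 << (FRAC_BITS - 1))) >> FRAC_BITS) + offset  # round, rescale
        out.append(max(0, min(255, value)))  # clamp to the 8-bit range
    return tuple(out)
```

In hardware, each look-up table would typically map onto FPGA logic resources and the loop over bit positions could be unrolled or pipelined; the sketch only shows that the DA result equals the ordinary multiply-accumulate result for the same quantized coefficients.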
