Abstract
In this paper, a new concurrent architecture based on index partition and CORDIC techniques is proposed for implementing discrete cosine transforms (DCTs) of power-of-two length. The architecture operates essentially in a serial-in, parallel-out mode. Three stages of pipelining can be applied, which improves the throughput rate. Each processing element (PE) is essentially a CORDIC processor with a fixed rotation angle, and only N/2 PEs are required to compute an N-point DCT. Since the pipeline stages are nearly balanced, the concurrency offered by pipelining is exploited as fully as possible. The throughput rate of this architecture is N/((N + 2)T), where N is the transform length and T is the system clock period. With a clock period of 50 ns, which is readily achievable in VLSI chips, the throughput rate is about 17.7 MHz for N = 16. The proposed architecture thus makes real-time computation possible.
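
To make the idea of a CORDIC processor with a fixed rotation angle more concrete, the following is a minimal floating-point sketch in Python. It is not the paper's PE design: the function names, the iteration count of 16, and the example angle of pi/32 are assumptions chosen for illustration. The point it shows is that, when the angle is known in advance, the directions of the shift-and-add micro-rotations can be precomputed once, so the rotation itself needs no angle accumulator.

```python
import math


def cordic_signs(angle, iterations=16):
    """Precompute micro-rotation directions (+1 or -1) for a fixed angle.

    Hypothetical helper: the angle is assumed to lie within the CORDIC
    convergence range (roughly |angle| <= 1.74 rad).
    """
    signs = []
    residual = angle
    for i in range(iterations):
        d = 1 if residual >= 0 else -1
        signs.append(d)
        residual -= d * math.atan(2.0 ** -i)
    return signs


def cordic_gain(iterations=16):
    """Accumulated scale factor of the micro-rotations."""
    gain = 1.0
    for i in range(iterations):
        gain *= math.sqrt(1.0 + 2.0 ** (-2 * i))
    return gain


def cordic_rotate(x, y, signs):
    """Rotate (x, y) counterclockwise by the angle encoded in `signs`.

    Each step multiplies by a power of two, which corresponds to a shift
    in fixed-point hardware; floats are used here only for readability.
    """
    for i, d in enumerate(signs):
        x, y = x - d * y * 2.0 ** -i, y + d * x * 2.0 ** -i
    gain = cordic_gain(len(signs))
    return x / gain, y / gain


if __name__ == "__main__":
    angle = math.pi / 32  # illustrative angle, not taken from the paper
    signs = cordic_signs(angle)
    print(cordic_rotate(1.0, 0.0, signs))
    print((math.cos(angle), math.sin(angle)))
```

In hardware, the precomputed signs would be wired into the PE and the gain correction folded into a constant scaling, which is one reason a fixed-angle CORDIC unit can be simpler than a general-purpose one.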
