A 90mW/GFlop 3.4GHz Reconfigurable Fused/Continuous Multiply-Accumulator for Floating-Point and Integer Operands in 65nm

Abstract
This paper describes an energy-efficient, reconfigurable fused/continuous Multiply-Accumulator (MAC) architecture for single-precision floating-point and 16-bit signed integer operands. The eight-stage pipelined MAC design has single-cycle throughput and contains a bit-level pipelined multiplier, followed by a fast sparse-tree adder and a single-cycle accumulator loop with delayed normalization logic. Operation-driven energy control is achieved using dynamic clock gating and fine-grained power gating techniques. Power gating is applied to 98% of the design and saves 79% of leakage power in idle mode at a 1.2 V supply and 110 °C. Fully shared logic in the multiplier, accumulator, and normalization blocks across the different operations enables a compact design of 0.54 mm² containing 117K transistors in an eight-metal 65 nm CMOS technology. The 15-FO4 design provides 6.8 GFLOPS of performance with a total energy efficiency of 90 mW/GFLOP at 1.2 V and 3.4 GHz operation.
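
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only: the constants are the values quoted above, while every derived quantity (FLOPs per cycle, implied total power, energy per operation, per-FO4 delay, pipeline latency) is a back-of-the-envelope estimate that the abstract does not state. It assumes the usual convention that one MAC counts as two floating-point operations and that each of the eight pipeline stages takes one clock cycle; the function and key names are hypothetical.

```python
# Figures quoted in the abstract.
CLOCK_GHZ = 3.4                   # operating frequency
THROUGHPUT_GFLOPS = 6.8           # reported peak performance
EFFICIENCY_MW_PER_GFLOPS = 90.0   # reported total energy efficiency
PIPELINE_STAGES = 8               # eight-stage pipeline
CYCLE_FO4 = 15                    # cycle time expressed in FO4 delays


def derived_figures():
    """Return estimates implied by the quoted figures (not stated in the paper)."""
    cycle_ps = 1000.0 / CLOCK_GHZ  # clock period in picoseconds
    return {
        # 6.8 GFLOPS at 3.4 GHz implies 2 FLOPs (multiply + add) per cycle.
        "flops_per_cycle": THROUGHPUT_GFLOPS / CLOCK_GHZ,
        # mW/GFLOPS times GFLOPS gives the implied total power in mW.
        "total_power_mw": EFFICIENCY_MW_PER_GFLOPS * THROUGHPUT_GFLOPS,
        # 1 mW/GFLOPS is numerically equal to 1 pJ per floating-point operation.
        "energy_per_flop_pj": EFFICIENCY_MW_PER_GFLOPS,
        "cycle_time_ps": cycle_ps,
        # Assumes the 15-FO4 figure describes the full clock period.
        "fo4_delay_ps": cycle_ps / CYCLE_FO4,
        # Assumes one clock cycle per pipeline stage.
        "pipeline_latency_ns": PIPELINE_STAGES * cycle_ps / 1000.0,
    }


if __name__ == "__main__":
    for name, value in derived_figures().items():
        print(f"{name}: {value:.2f}")
```

Under these assumptions the quoted throughput and clock rate are consistent with one multiply-accumulate issued per cycle, and the efficiency figure corresponds to roughly 0.6 W at peak throughput.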
