Polynomial-time algorithm for on-chip scratchpad memory partitioning
- 30 October 2003
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 318-326
- https://doi.org/10.1145/951710.951751
Abstract
Focusing on embedded applications, scratchpad memories (SPMs) look like a best-compromise solution when taking into account performance, energy consumption and die area. The main challenge in SPM design is mapping memory locations to scratchpad locations. This paper describes an algorithm to optimally solve such a mapping problem by means of Dynamic Programming applied to a synthesizable hardware architecture. The algorithm works by mapping segments of external memory to physically partitioned banks of an on-chip SPM; this architecture provides significant energy savings. The algorithm does not require any user-set bound on the number of partitions and takes into account partitioning overhead. Improving on previous solutions, execution time is polynomial in the input size. Strategies to optimize memory requirements and speed of the algorithm are exploited. Additionally, we integrate this algorithm in a complete and automated design, simulation and synthesis flow.
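The abstract does not give the recurrence itself, so the following is only an illustrative sketch of how a dynamic program of this general shape might look, not the authors' algorithm. It assumes the address space is divided into equal-sized blocks with a profiled access count each, that a bank covers a contiguous run of blocks, that energy per access depends only on bank size, and that every bank adds a fixed overhead. The names `partition_banks`, `accesses`, `bank_energy` and `overhead` are hypothetical.

```python
from itertools import accumulate


def partition_banks(accesses, bank_energy, overhead):
    """Split blocks 0..n-1 into contiguous banks with minimum total cost.

    accesses    -- access count per block (assumed profile data)
    bank_energy -- function: bank size in blocks -> energy per access (assumed model)
    overhead    -- fixed cost charged once per bank (assumed partitioning overhead)

    Returns (total_cost, banks), where banks is a list of half-open
    block ranges (start, end). No bound on the number of banks is required.
    """
    n = len(accesses)
    # prefix[k] = total accesses to blocks 0..k-1, so any range sum is O(1).
    prefix = [0] + list(accumulate(accesses))

    best = [float("inf")] * (n + 1)  # best[j]: min cost to cover blocks 0..j-1
    best[0] = 0
    cut = [0] * (n + 1)              # cut[j]: start of the last bank in that optimum

    for j in range(1, n + 1):
        for i in range(j):
            # Last bank covers blocks i..j-1; everything before it is already optimal.
            cost = best[i] + (prefix[j] - prefix[i]) * bank_energy(j - i) + overhead
            if cost < best[j]:
                best[j] = cost
                cut[j] = i

    # Walk the recorded cut points backwards to recover the bank boundaries.
    banks = []
    j = n
    while j > 0:
        banks.append((cut[j], j))
        j = cut[j]
    banks.reverse()
    return best[n], banks


if __name__ == "__main__":
    # Two hot blocks followed by four cold ones; energy per access grows with bank size.
    total, banks = partition_banks([100, 90, 1, 1, 1, 1], lambda size: size, overhead=5)
    print(total, banks)
```

With n blocks this sketch evaluates O(n^2) candidate banks, which is polynomial in the input size. The per-bank overhead term is what keeps the optimum from degenerating into one bank per block.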