Incremental hierarchical memory size estimation for steering of loop transformations
- 1 September 2007
- journal article
- Published by Association for Computing Machinery (ACM) in ACM Transactions on Design Automation of Electronic Systems
- Vol. 12 (4) , 50
- https://doi.org/10.1145/1278349.1278363
Abstract
Modern embedded multimedia and telecommunications systems need to store and access huge amounts of data. This becomes a critical factor for the overall energy consumption, area, and performance of the systems. Loop transformations are essential to improve the data access locality and regularity in order to optimally design or utilize a memory hierarchy. However, due to abstract high-level cost functions, current loop transformation steering techniques do not take the memory platform sufficiently into account. They usually also result in only one final transformation solution. On the other hand, the loop transformation search space for real-life applications is huge, especially if the memory platform is still not fully fixed. Use of existing loop transformation techniques will therefore typically lead to suboptimal end-products. It is critical to find all interesting loop transformation instances. This can only be achieved by performing an evaluation of the effect of later design stages at the early loop transformation stage. This article presents a fast incremental hierarchical memory-size requirement estimation technique. It estimates the influence of any given sequence of loop transformation instances on the mapping of application data onto a hierarchical memory platform. As the exact memory platform instantiation is often not yet defined at this high-level design stage, a platform-independent estimation is introduced with a Pareto curve output for each loop transformation instance. Comparison among the Pareto curves helps the designer, or a steering tool, to find all interesting loop transformation instances that might later lead to low-power data mapping for any of the many possible memory hierarchy instances. Initially, the source code is used as input for estimation. However, performing the estimation repeatedly from the source code is too slow for large search space exploration. An incremental approach, based on local updating of the previous result, is therefore used to handle sequences of different loop transformations. Experiments show that the initial approach takes a few seconds, which is two orders of magnitude faster than state-of-the-art solutions but still too costly to be performed interactively many times. The incremental approach typically takes just a few milliseconds, which is another two orders of magnitude faster than the initial approach. This huge speedup allows us for the first time to handle real-life industrial-size applications and get realistic feedback during loop transformation exploration.Keywords
This publication has 23 references indexed in Scilit:
- Semi-Automatic Composition of Loop Transformations for Deep Parallelism and Memory HierarchiesInternational Journal of Parallel Programming, 2006
- Data Access and Storage Management for Embedded Programmable ProcessorsPublished by Springer Nature ,2002
- On the complexity of loop fusionParallel Computing, 2000
- Increasing energy efficiency of embedded systems by application-specific memory hierarchy generationIEEE Design & Test of Computers, 2000
- Custom Memory Management MethodologyPublished by Springer Nature ,1998
- Affine-by-Statement Scheduling of Uniform and Affine Loop Nests over Parametric DomainsJournal of Parallel and Distributed Computing, 1995
- Background memory area estimation for multidimensional signal processing systemsIEEE Transactions on Very Large Scale Integration (VLSI) Systems, 1995
- Compiler transformations for high-performance computingACM Computing Surveys, 1994
- Loop Transformations for Restructuring Compilers: The FoundationsPublished by Springer Nature ,1993
- A study of replacement algorithms for a virtual-storage computerIBM Systems Journal, 1966