Nonlinear array layouts for hierarchical memory systems
- 1 May 1999
- proceedings article
- Published by Association for Computing Machinery (ACM)
- p. 444-453
- https://doi.org/10.1145/305138.305231
Abstract
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory. This layout function is fixed at language definition time and constitutesan in- visible, non-programmable array attribute. In reality, mo dern mem- ory systems are architecturally hierarchical rather than fl at, with substantial differences in performance among different le vels of the hierarchy. This mismatch between the model and the true archi- tecture of memory systems can result in low locality of reference and poor performance. Some of this loss in performance can be recovered by re-ordering computations using transformati ons such as loop tiling. We explore nonlinear array layout functions as an additional means of improving locality of reference. For a b ench- mark suite composed of dense matrix kernels, we show by timing and simulation that two specific layouts (4D and Morton) havelow implementation costs (2-5% of total running time) and high perfor- mance benefits (reducing execution time by factors of 1.1-2.5); that they have smooth performance curves, both across a wide range of problem sizes and over representative cache architectures ; and that recursion-based control structures may be needed to fully e xploit their potential.Keywords
This publication has 39 references indexed in Scilit:
- Automatic data layout for distributed-memory machinesACM Transactions on Programming Languages and Systems, 1998
- Active memoryACM Transactions on Modeling and Computer Simulation, 1997
- Can parallel algorithms enhance serial implementation?Communications of the ACM, 1996
- Dynamic partitioning of non-uniform structured workloads with spacefilling curvesIEEE Transactions on Parallel and Distributed Systems, 1996
- The influence of caches on the performance of heapsACM Journal of Experimental Algorithmics, 1996
- Influence of cross-interferences on blocked loopsACM Transactions on Programming Languages and Systems, 1995
- Optimal evaluation of array expressions on massively parallel machinesACM Transactions on Programming Languages and Systems, 1995
- Footprints in the cacheACM Transactions on Computer Systems, 1987
- Ueber die stetige Abbildung einer Line auf ein Fl chenst ckMathematische Annalen, 1891
- Sur une courbe, qui remplit toute une aire planeMathematische Annalen, 1890