Nonlinear array layouts for hierarchical memory systems

Abstract
Programming languages that provide multidimensional arrays and a flat linear model of memory must implement a mapping between these two domains to order array elements in memory. This layout function is fixed at language definition time and constitutesan in- visible, non-programmable array attribute. In reality, mo dern mem- ory systems are architecturally hierarchical rather than fl at, with substantial differences in performance among different le vels of the hierarchy. This mismatch between the model and the true archi- tecture of memory systems can result in low locality of reference and poor performance. Some of this loss in performance can be recovered by re-ordering computations using transformati ons such as loop tiling. We explore nonlinear array layout functions as an additional means of improving locality of reference. For a b ench- mark suite composed of dense matrix kernels, we show by timing and simulation that two specific layouts (4D and Morton) havelow implementation costs (2-5% of total running time) and high perfor- mance benefits (reducing execution time by factors of 1.1-2.5); that they have smooth performance curves, both across a wide range of problem sizes and over representative cache architectures ; and that recursion-based control structures may be needed to fully e xploit their potential.

This publication has 39 references indexed in Scilit: