Optimal disk I/O with parallel block transfer

Abstract
We provide optimal algorithms for sort- ing, FFT, matrix transposition, standard matrix mul- tiplication, and related problems in terms of the num- ber of input/outputs (I/Os) required between internal memory and secondary storage. Our two-level memory model is new and gives a realistic treatment of paral- lel block transfer, in which during a single I/O each of the P secondary storage devices (disks) can simultane- ously transfer a contiguous block of B records. We also introduce parallel variants of the hierarchical memory models of (1,2) and give optimal algorithms. In our models there are P hierarchies, which operate in paral- lel. Communication among the hierarchies takes place at a base memory level. The difficulty in developing optimal algorithms in our two-level and hierarchical models is to cope with the partitioning of memory into P separate physical devices. Our sorting algorithms are randomized, but practical; the probability of using more than an optimal number of I/Os falls off exponentially.