Secondary index optimization

Abstract
A major problem to be considered in the design of a data base is that of index selection. In this paper we present a model of a data base together with a probabilistic model for the transactions conducted with the data base: queries and updates. We obtain characterizations of the optimal solution to the best choice of indices. An algorithm is shown to solve this problem, which, in a number of cases, has a running time of O(m log m), where m is the number of attributes of the file.