A framework for research in database management for statistical analysis or a primer on statistical database management problems for computer scientists

Abstract
This paper introduces readers familiar with database management issues to the problems of managing large statistical databases. We begin with a characterization of statistical databases based on the structure and use of the data they contain. Several data management problems are then described. In particular, we discuss the problem of repetitive computations on large segments of the database during the lifetime of a statistical analysis. We then present the organization of a data management system that avoids this problem by caching previously computed results and automatically maintaining their integrity. We conclude with a list of problems that this organization raises and a discussion of related work.
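
The caching idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the organization proposed in the paper: the class `CachedColumnStore`, its methods, and the choice to discard cached values on update (rather than maintain them incrementally) are all assumptions made for illustration only.

```python
from statistics import mean
from typing import Callable, Dict, List, Tuple


class CachedColumnStore:
    """Toy column store that caches summary statistics (hypothetical design)."""

    def __init__(self) -> None:
        self._columns: Dict[str, List[float]] = {}
        # Cache keyed by (column name, statistic name).
        self._cache: Dict[Tuple[str, str], float] = {}
        # Counts full scans of a column, to show when the cache is hit.
        self.computations = 0

    def load(self, column: str, values: List[float]) -> None:
        """Store a column and drop any cached results that referred to it."""
        self._columns[column] = list(values)
        self._invalidate(column)

    def update(self, column: str, index: int, value: float) -> None:
        """Change one value; cached statistics for the column become stale."""
        self._columns[column][index] = value
        self._invalidate(column)

    def summary(self, column: str, name: str,
                fn: Callable[[List[float]], float]) -> float:
        """Return a cached statistic, scanning the column only on a miss."""
        key = (column, name)
        if key not in self._cache:
            self.computations += 1
            self._cache[key] = fn(self._columns[column])
        return self._cache[key]

    def _invalidate(self, column: str) -> None:
        # Assumption: integrity is kept by discarding stale entries,
        # not by updating them in place.
        for key in [k for k in self._cache if k[0] == column]:
            del self._cache[key]


store = CachedColumnStore()
store.load("income", [10.0, 20.0, 30.0])
first = store.summary("income", "mean", mean)   # scans the column
second = store.summary("income", "mean", mean)  # served from the cache
store.update("income", 0, 40.0)                 # invalidates the cached mean
third = store.summary("income", "mean", mean)   # scans the column again
```

In this sketch the repeated request for the mean costs no second scan, and the update forces a recomputation on the next request. A real system would likely prefer incremental maintenance for statistics that allow it, which this sketch does not attempt.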
