Minimizing Information Loss in Simple Aggregation

Abstract
A criterion is presented to indicate a strategy for additively combining corresponding discrete rows and columns of a square array to form an array of reduced size such that the loss of information thereby incurred is minimized. The objective function is expressed in terms of an information loss criterion between the original array and an array of the same size acting as a surrogate for the reduced array. In contrast to the usual application of information theory in planning, where arrays are filled out based on incomplete information, the converse case is implied here, in which a particular partial representation of the array is sought (that is, the aggregated array) which most closely approaches the information content of the original full array. The method is expected to be useful for situations where available large arrays of data need to be appropriately reduced in size so that they can be efficiently used in further computations. Alternatively, the original array may contain current observed data, and the primary aim may be to determine from this array the aggregation scheme to use in future forecasting calculations. Typical applications may include the aggregation of spatial zones, the aggregation of economic sectors in input-output analysis, as well as various clustering problems.
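The aggregation and loss calculation described above can be illustrated with a minimal sketch. The abstract does not give the exact objective function, so everything specific here is an assumption: the surrogate array is built by spreading each aggregated cell total evenly over the original cells merged into it, the loss is taken to be the Kullback-Leibler divergence between the normalised original and surrogate arrays, and entries are assumed non-negative. The names `aggregate`, `surrogate`, `information_loss`, and `best_pairwise_merge` are hypothetical, and the exhaustive search over single pairs is only a stand-in for whatever strategy the criterion indicates.

```python
import math
from itertools import combinations


def aggregate(array, groups):
    """Additively combine corresponding rows and columns of a square array.

    groups[i] is the aggregate index (0..k-1) assigned to row/column i.
    """
    k = max(groups) + 1
    reduced = [[0.0] * k for _ in range(k)]
    for i, row in enumerate(array):
        for j, value in enumerate(row):
            reduced[groups[i]][groups[j]] += value
    return reduced


def surrogate(array, groups):
    """Same-size stand-in for the reduced array.

    ASSUMPTION: each aggregated total is spread evenly over the
    original cells that were merged into it.
    """
    reduced = aggregate(array, groups)
    sizes = [groups.count(g) for g in range(len(reduced))]
    n = len(array)
    return [
        [reduced[groups[i]][groups[j]] / (sizes[groups[i]] * sizes[groups[j]])
         for j in range(n)]
        for i in range(n)
    ]


def information_loss(array, groups):
    """ASSUMPTION: loss = KL divergence (in nats) between the normalised
    original array and the normalised surrogate array."""
    total = sum(map(sum, array))
    loss = 0.0
    for row, s_row in zip(array, surrogate(array, groups)):
        for p, q in zip(row, s_row):
            if p > 0:  # zero cells contribute nothing
                loss += (p / total) * math.log(p / q)
    return loss


def best_pairwise_merge(array):
    """Try every merge of two rows/columns; return (loss, pair) with least loss."""
    n = len(array)
    candidates = []
    for a, b in combinations(range(n), 2):
        labels = list(range(n))
        labels[b] = a  # merge b into a
        relabel = {}
        groups = [relabel.setdefault(lab, len(relabel)) for lab in labels]
        candidates.append((information_loss(array, groups), (a, b)))
    return min(candidates)


# Illustrative data only: zones 0 and 1 behave identically.
flows = [[10, 10, 1],
         [10, 10, 1],
         [2, 2, 8]]
loss, pair = best_pairwise_merge(flows)
```

In this illustrative array the first two rows and columns are interchangeable, so merging them leaves the surrogate identical to the original and the assumed loss is zero; any other pairing discards some structure and gives a positive loss.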
