Typicality, Diversity, and Feature Pattern of an Ensemble

Abstract
This paper discusses feature patterns in terms of both feature composition and feature interdependence, and formulates the concepts of typicality and diversity of an ensemble. The features of the specimens under investigation are organized in a two-dimensional array, called an observation matrix, in which each row vector represents the ordered set of features of one specimen. An algorithm based on the proposed measures and on statistical screening is implemented to extract feature patterns. Within the algorithm, schemes for feature patterns and specimen reweighting are proposed to make the best use of the information available in the array and to minimize possible bias caused by uneven sampling of the ensemble. Two sets of real-world data, from the environmental and molecular biology areas, are used to illustrate the physical meaning of the proposed measures and to demonstrate the operational feasibility and significance of the methodology in analyzing homologous ensembles that are subject to variable degrees of diversity.