A Rank-by-Feature Framework for Interactive Exploration of Multidimensional Data
- 1 June 2005
- journal article
- Published by SAGE Publications in Information Visualization
- Vol. 4 (2), 96-113
- https://doi.org/10.1057/palgrave.ivs.9500091
Abstract
Interactive exploration of multidimensional data sets is challenging because: (1) it is difficult to comprehend patterns in more than three dimensions, and (2) current systems are often a patchwork of graphical and statistical methods, leaving many researchers uncertain about how to explore their data in an orderly manner. We offer a set of principles and a novel rank-by-feature framework that could enable users to better understand distributions in one (1D) or two dimensions (2D), and then discover relationships, clusters, gaps, outliers, and other features. Users of our framework can view graphical presentations (histograms, boxplots, and scatterplots), and then choose a feature detection criterion to rank 1D or 2D axis-parallel projections. By combining information visualization techniques (overview, coordination, and dynamic query) with summaries and statistical methods, users can systematically examine the most important 1D and 2D axis-parallel projections. We summarize our Graphics, Ranking, and Interaction for Discovery (GRID) principles as: (1) study 1D, study 2D, then find features; and (2) ranking guides insight, statistics confirm. We implemented the rank-by-feature framework in the Hierarchical Clustering Explorer, but the same data exploration principles could enable users to organize their discovery process so as to produce more thorough analyses and extract deeper insights in any multidimensional data application, such as spreadsheets, statistical packages, or information visualization tools.
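The core idea of ranking 2D axis-parallel projections by a feature detection criterion can be illustrated with a minimal sketch. This is not the Hierarchical Clustering Explorer implementation. The function names (`pearson`, `rank_2d_projections`), the choice of the Pearson correlation coefficient as the ranking criterion, and the sample data are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column has no linear relationship to report
    return cov / sqrt(var_x * var_y)


def rank_2d_projections(columns, criterion=pearson):
    """Score every pair of dimensions and sort by absolute score, strongest first.

    `columns` maps a dimension name to its list of values (hypothetical layout).
    Each pair of dimensions corresponds to one axis-parallel 2D projection,
    i.e. one candidate scatterplot.
    """
    scored = [
        ((a, b), criterion(columns[a], columns[b]))
        for a, b in combinations(columns, 2)
    ]
    return sorted(scored, key=lambda item: abs(item[1]), reverse=True)


# Made-up sample data: two nearly linearly related dimensions and one unrelated.
columns = {
    "height": [1.0, 2.0, 3.0, 4.0, 5.0],
    "weight": [2.1, 3.9, 6.2, 8.0, 9.9],
    "noise": [3.0, 1.0, 4.0, 1.0, 5.0],
}

ranking = rank_2d_projections(columns)
for (a, b), score in ranking:
    print(f"{a} vs {b}: {score:+.3f}")
```

In this sketch the ranked list plays the role of the guide ("ranking guides insight"): a user would inspect the scatterplots at the top of the list first and then confirm any apparent relationship with statistics. Any other scoring function that takes two columns could be passed as `criterion`.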