Continuous Representations of Time-Series Gene Expression Data
- 1 June 2003
- journal article
- research article
- Published by Mary Ann Liebert Inc in Journal of Computational Biology
- Vol. 10 (3-4), 341-356
- https://doi.org/10.1089/10665270360688057
Abstract
We present algorithms for time-series gene expression analysis that permit the principled estimation of unobserved time points, clustering, and dataset alignment. Each expression profile is modeled as a cubic spline (piecewise polynomial) that is estimated from the observed data, and every time point influences the overall smooth expression curve. We constrain the spline coefficients of genes in the same class to have similar expression patterns, while also allowing for gene-specific parameters. We show that unobserved time points can be reconstructed using our method with 10-15% less error when compared to previous best methods. Our clustering algorithm operates directly on the continuous representations of gene expression profiles, and we demonstrate that this is particularly effective when applied to nonuniformly sampled data. Our continuous alignment algorithm also avoids difficulties encountered by discrete approaches. In particular, our method allows for control of the number of degrees of freedom of the warp through the specification of parameterized functions, which helps to avoid overfitting. We demonstrate that our algorithm produces stable low-error alignments on real expression data and further show a specific application to yeast knock-out data that produces biologically meaningful results.
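The core idea in the abstract, fitting a cubic spline to a nonuniformly sampled profile and reading off an unobserved time point, can be illustrated with a minimal sketch. This is not the authors' model: it fits a single profile by ordinary least squares on a truncated-power cubic basis and does not share coefficients across genes in a class. The function names, knot positions, sampling times, and the synthetic sine-shaped profile are all assumptions made for illustration.

```python
import numpy as np

def cubic_spline_basis(t, knots):
    """Truncated-power cubic basis: 1, t, t^2, t^3, plus (t - k)_+^3 per knot."""
    t = np.asarray(t, dtype=float)
    cols = [np.ones_like(t), t, t**2, t**3]
    cols += [np.clip(t - k, 0.0, None) ** 3 for k in knots]
    return np.column_stack(cols)

def fit_spline(times, values, knots):
    """Least-squares spline coefficients for one expression profile."""
    basis = cubic_spline_basis(times, knots)
    coef, *_ = np.linalg.lstsq(basis, values, rcond=None)
    return coef

def evaluate_spline(coef, times, knots):
    """Evaluate the fitted spline at arbitrary (possibly unobserved) times."""
    return cubic_spline_basis(times, knots) @ coef

# Hypothetical nonuniform sampling schedule (minutes), rescaled to [0, 1]
# to keep the least-squares problem well conditioned.
minutes = np.array([0.0, 10.0, 20.0, 30.0, 50.0, 70.0, 90.0, 120.0, 150.0, 180.0])
t_norm = minutes / minutes[-1]

# Synthetic profile standing in for a measured expression time series.
values = np.sin(2.0 * np.pi * t_norm)

# Hold out one time point and treat it as unobserved.
held_out = 5
mask = np.arange(t_norm.size) != held_out

knots = [0.25, 0.5, 0.75]  # assumed interior knots
coef = fit_spline(t_norm[mask], values[mask], knots)
estimate = evaluate_spline(coef, t_norm[[held_out]], knots)[0]

print(f"held-out time: {minutes[held_out]:.0f} min")
print(f"estimate: {estimate:.4f}, true value: {values[held_out]:.4f}")
```

Because the curve is continuous, the same `evaluate_spline` call works at any time, not only at the original sampling times, which is what makes a representation of this kind convenient for nonuniformly sampled data.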