Predicting transcription factor activities from combined analysis of microarray and ChIP data: a partial least squares approach
Open Access
- 24 June 2005
- journal article
- research article
- Published by Springer Nature in Theoretical Biology and Medical Modelling
- Vol. 2 (1) , 23
- https://doi.org/10.1186/1742-4682-2-23
Abstract
Background: The study of the network between transcription factors and their targets is important for understanding the complex regulatory mechanisms in a cell. Unfortunately, standard microarray experiments cannot measure transcription factor activities (TFAs) directly: transcription factors are subject to post-translational modifications, so their own transcription levels are poor proxies for their activity.
Results: Here we propose a statistical approach based on partial least squares (PLS) regression to infer the true TFAs from a combination of mRNA expression and DNA-protein binding measurements. The method remains statistically sound for small samples and allows functional interactions among transcription factors to be detected via the notion of "meta"-transcription factors. In addition, it enables false positives in ChIP data to be identified and activation to be distinguished from suppression.
Conclusion: The proposed method performs very well both for simulated data and for real expression and ChIP data from yeast and E. coli experiments. It overcomes the limitations of previously used approaches to estimating TFAs. The estimated profiles may also serve as input for further studies, such as tests of periodicity or differential regulation. An R package "plsgenomics" implementing the proposed methods is available for download from the CRAN archive.
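The idea in the Results paragraph can be illustrated with a minimal sketch. It assumes a linear model in which a genes-by-samples expression matrix is regressed on a genes-by-transcription-factor connectivity matrix derived from ChIP data, so that the regression coefficients play the role of TFAs and the PLS latent components play the role of "meta"-transcription factors. The function name `pls_tfa`, the simulated data, and the choice of a NIPALS-style PLS2 algorithm are assumptions made for illustration; this is not the plsgenomics implementation.

```python
import numpy as np

def pls_tfa(chip, expr, ncomp):
    """Hypothetical helper: PLS regression of expr (genes x samples)
    on chip (genes x TFs); returns a TFs x samples coefficient matrix."""
    X = chip - chip.mean(axis=0)   # column-center connectivity
    Y = expr - expr.mean(axis=0)   # column-center expression
    n_tf = X.shape[1]
    W = np.zeros((n_tf, ncomp))        # weight vectors
    P = np.zeros((n_tf, ncomp))        # X loadings
    Q = np.zeros((Y.shape[1], ncomp))  # Y loadings
    Xa = X.copy()
    for a in range(ncomp):
        # Weight: dominant left singular vector of the cross-product matrix.
        U, _, _ = np.linalg.svd(Xa.T @ Y, full_matrices=False)
        w = U[:, 0]
        t = Xa @ w                 # latent component scores over genes
        tt = t @ t
        p = Xa.T @ t / tt
        q = Y.T @ t / tt
        Xa = Xa - np.outer(t, p)   # deflate X before the next component
        W[:, a], P[:, a], Q[:, a] = w, p, q
    # Map the latent-space fit back to coefficients on the original TFs.
    return W @ np.linalg.solve(P.T @ W, Q.T)

# Simulated example (all values are made up for illustration).
rng = np.random.default_rng(0)
n_genes, n_tf, n_samples = 200, 5, 8
chip = rng.integers(0, 2, size=(n_genes, n_tf)).astype(float)
true_tfa = rng.normal(size=(n_tf, n_samples))
expr = chip @ true_tfa + 0.1 * rng.normal(size=(n_genes, n_samples))

tfa_hat = pls_tfa(chip, expr, ncomp=3)
print(tfa_hat.shape)  # (5, 8)
```

Using fewer components than transcription factors is what gives the dimension reduction; with as many components as transcription factors and a full-rank connectivity matrix, the estimate coincides with ordinary least squares.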