oPOSSUM: identification of over-represented transcription factor binding sites in co-expressed genes
Open Access
- 2 June 2005
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 33 (10) , 3154-3164
- https://doi.org/10.1093/nar/gki624
Abstract
Targeted transcript profiling studies can identify sets of co-expressed genes; however, identification of the underlying functional mechanism(s) is a significant challenge. Established methods for the analysis of gene annotations, particularly those based on the Gene Ontology, can identify functional linkages between genes. Similar methods for the identification of over-represented transcription factor binding sites (TFBSs) have been successful in yeast, but extension to human genomics has largely proved ineffective. Creation of a system for the efficient identification of common regulatory mechanisms in a subset of co-expressed human genes promises to break a roadblock in functional genomics research. We have developed an integrated system that searches for evidence of co-regulation by one or more transcription factors (TFs). oPOSSUM combines a pre-computed database of conserved TFBSs in human and mouse promoters with statistical methods for identification of sites over-represented in a set of co-expressed genes. The algorithm successfully identified mediating TFs in control sets of tissue-specific genes and in sets of co-expressed genes from three transcript profiling studies. Simulation studies indicate that oPOSSUM produces few false positives using empirically defined thresholds and can tolerate up to 50% noise in a set of co-expressed genes.
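The abstract does not spell out the statistical methods, so the following is only an illustrative sketch of one common way to score over-representation: a one-tailed hypergeometric (Fisher exact style) test on gene counts. The function names, the gene identifiers, and the TF labels `TF_A` and `TF_B` are hypothetical, and the toy data stand in for a pre-computed table of conserved sites; none of this is taken from the oPOSSUM implementation.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k): chance of seeing at least k site-bearing genes when n genes
    are drawn without replacement from N background genes, K of which carry
    a conserved site for the TF."""
    total = comb(N, n)
    upper = min(n, K)
    hits = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return hits / total


def rank_tfs(sites_by_tf, background, target):
    """Rank TFs by over-representation of their sites in the target gene set.

    sites_by_tf: dict mapping a TF label to the set of genes with a site.
    background:  set of all genes considered.
    target:      set of co-expressed genes (restricted to the background).
    Returns a list of (tf, hits_in_target, hits_in_background, p_value),
    smallest p-value first.
    """
    target = target & background
    results = []
    for tf, genes in sites_by_tf.items():
        K = len(genes & background)
        k = len(genes & target)
        p = hypergeom_upper_tail(k, len(target), K, len(background))
        results.append((tf, k, K, p))
    return sorted(results, key=lambda r: r[3])


# Hypothetical toy data: 20 background genes, 5 of them co-expressed.
background = {f"g{i}" for i in range(20)}
target = {"g0", "g1", "g2", "g3", "g4"}
sites_by_tf = {
    "TF_A": {"g0", "g1", "g2", "g3", "g10"},           # mostly in the target set
    "TF_B": {"g4", "g11", "g12", "g13", "g14", "g15"},  # mostly outside it
}

ranked = rank_tfs(sites_by_tf, background, target)
for tf, k, K, p in ranked:
    print(f"{tf}: {k}/{len(target)} target genes, {K}/{len(background)} background, p={p:.4g}")
```

In this toy example `TF_A` ranks first because four of its five site-bearing genes fall in the co-expressed set. A real analysis would also need a significance threshold and a correction for testing many TFs at once, which this sketch leaves out.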