Alignment Uncertainty and Genomic Analysis
- 25 January 2008
- journal article
- Published by American Association for the Advancement of Science (AAAS) in Science
- Vol. 319 (5862), 473-476
- https://doi.org/10.1126/science.1151532
Abstract
The statistical methods applied to the analysis of genomic data do not account for uncertainty in the sequence alignment. Indeed, the alignment is treated as an observation, and all of the subsequent inferences depend on the alignment being correct. This may not have been too problematic for many phylogenetic studies, in which the gene is carefully chosen for, among other things, ease of alignment. However, in a comparative genomics study, the same statistical methods are applied repeatedly on thousands of genes, many of which will be difficult to align. Using genomic data from seven yeast species, we show that uncertainty in the alignment can lead to several problems, including different alignment methods resulting in different conclusions.
This publication has 23 references indexed in Scilit:
- Incorporating indel information into phylogeny estimation for rapidly emerging pathogens. BMC Ecology and Evolution, 2007
- BAli-Phy: simultaneous Bayesian inference of alignment and phylogeny. Bioinformatics, 2006
- ProbCons: Probabilistic consistency-based multiple sequence alignment. Genome Research, 2005
- MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research, 2004
- Inferring Nonneutral Evolution from Human-Chimp-Mouse Orthologous Gene Trios. Science, 2003
- Finding Functional Features in Saccharomyces Genomes by Phylogenetic Footprinting. Science, 2003
- Sequencing and comparison of yeast species to identify genes and regulatory elements. Nature, 2003
- Surveying Saccharomyces Genomes to Identify Functional Elements by Comparative DNA Sequence Analysis. Genome Research, 2001
- T-coffee: a novel method for fast and accurate multiple sequence alignment. Journal of Molecular Biology, 2000
- CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic Acids Research, 1994