ESTimating plant phylogeny: lessons from partitioning
Open Access
- 15 June 2006
- journal article
- research article
- Published by Springer Nature in BMC Ecology and Evolution
- Vol. 6 (1) , 48
- https://doi.org/10.1186/1471-2148-6-48
Abstract
Background: While Expressed Sequence Tags (ESTs) have proven a viable and efficient way to sample genomes, particularly those for which whole-genome sequencing is impractical, phylogenetic analysis using ESTs remains difficult. Sequencing errors and orthology determination are the major problems when using ESTs as a source of characters for systematics. Here we develop methods to incorporate EST sequence information in a simultaneous analysis framework to address controversial phylogenetic questions regarding the relationships among the major groups of seed plants. We use an automated, phylogenetically derived approach to orthology determination called OrthologID, generate a phylogeny based on 43 process partitions, many of which are derived from ESTs, and examine several measures of support to assess the utility of EST data for phylogenies.
Results: A maximum parsimony (MP) analysis resulted in a single tree with relatively high support at all nodes in the tree despite rampant conflict among trees generated from the separate analysis of individual partitions. In a comparison of broader-scale groupings based on cellular compartment (i.e., chloroplast, mitochondrial or nuclear) or function, only the nuclear partition tree (based largely on EST data) was found to be topologically identical to the tree based on the simultaneous analysis of all data. Despite topological conflict among the broader-scale groupings examined, only the tree based on morphological data showed statistically significant differences.
Conclusion: Based on the amount of character support contributed by EST data, which make up a majority of the nuclear data set, and the lack of conflict of the nuclear data set with the simultaneous analysis tree, we conclude that the inclusion of EST data does provide a viable and efficient approach to address phylogenetic questions within a parsimony framework on a genomic scale, if problems of orthology determination and potential sequencing errors can be overcome.
In addition, approaches that examine conflict and support in a simultaneous analysis framework allow for a more precise understanding of the evolutionary history of individual process partitions and may be a novel way to understand functional aspects of different kinds of cellular classes of gene products.