Phylogenomics of Eukaryotes: Impact of Missing Data on Large Alignments
- 1 September 2004
- journal article
- research article
- Published by Oxford University Press (OUP) in Molecular Biology and Evolution
- Vol. 21 (9) , 1740-1752
- https://doi.org/10.1093/molbev/msh182
Abstract
Resolving the relationships between Metazoa and other eukaryotic groups as well as between metazoan phyla is central to the understanding of the origin and evolution of animals. The current view is based on limited data sets, either a single gene with many species (e.g., ribosomal RNA) or many genes but with only a few species. Because a reliable phylogenetic inference simultaneously requires numerous genes and numerous species, we assembled a very large data set containing 129 orthologous proteins (approximately 30,000 aligned amino acid positions) for 36 eukaryotic species. Included in the alignments are data from the choanoflagellate Monosiga ovata, obtained through the sequencing of about 1,000 cDNAs. We provide conclusive support for choanoflagellates as the closest relative of animals and for fungi as the second closest. The monophyly of Plantae and chromalveolates was recovered but without strong statistical support. Within animals, in contrast to the monophyly of Coelomata observed in several recent large-scale analyses, we recovered a paraphyletic Coelomata, with nematodes and platyhelminths nested within. To include a diverse sample of organisms, data from EST projects were used for several species, resulting in a large amount of missing data in our alignment (about 25%). By using different approaches, we verify that the inferred phylogeny is not sensitive to these missing data. Therefore, this large data set provides a reliable phylogenetic framework for studying eukaryotic and animal evolution and will be easily extendable when large amounts of sequence information become available from a broader taxonomic range.
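As an illustration of the missing-data proportion mentioned in the abstract, the following minimal sketch computes the fraction of missing cells in a concatenated alignment. It is not the authors' procedure: the function name `missing_fraction`, the use of `?` as the missing-data symbol, and the toy sequences are all assumptions made for this example.

```python
def missing_fraction(alignment, missing_chars="?"):
    """Return the fraction of alignment cells coded as missing.

    `alignment` maps a species name to its aligned sequence.
    Treating "?" as the missing-data symbol is an assumption of this
    sketch; the paper does not state how missing cells were coded.
    """
    if not alignment:
        raise ValueError("alignment is empty")
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")

    total = sum(len(seq) for seq in alignment.values())
    missing = sum(
        seq.count(ch) for seq in alignment.values() for ch in missing_chars
    )
    return missing / total if total else 0.0


# Hypothetical toy alignment: 3 species, 8 positions, 6 missing cells.
toy = {
    "species_A": "MKTAYIAK",
    "species_B": "MKT??IAK",
    "species_C": "????YIAK",
}

print(f"{missing_fraction(toy):.0%} of cells are missing")
```

Whether alignment gaps (`-`) should also count as missing depends on the analysis; passing `missing_chars="?-"` would include them.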