Alignment-independent bilinear multivariate modelling (AIBIMM) for global analyses of 16S rRNA gene phylogeny
- 1 July 2006
- journal article
- research article
- Published by Microbiology Society in International Journal of Systematic and Evolutionary Microbiology
- Vol. 56 (7) , 1565-1575
- https://doi.org/10.1099/ijs.0.63936-0
Abstract
Alignment-independent phylogenetic methods have interesting properties for global phylogenetic reconstructions, particularly with respect to speed and accuracy. Here, we present a novel multimer-based alignment-independent bilinear mathematical modelling (AIBIMM) approach for global 16S rRNA gene phylogenetic analyses. In AIBIMM, jackknife cross-validated principal component analyses (PCA) are used to explain the variance in nucleotide n-mer frequency data. We compared AIBIMM with alignment-based distance, maximum-parsimony and maximum-likelihood phylogenetic methods, analysing taxa belonging to the Proteobacteria (n=82), Actinobacteria (n=30) and Archaea (n=7). These analyses indicated an attraction between the Actinobacteria and Archaea for the traditional methods, with the two taxa Acidimicrobium and Rubrobacter at the root of the tree. AIBIMM, on the other hand, showed that the Actinobacteria was tightly clustered, with Acidimicrobium and Rubrobacter within a distinct subgroup of the Actinobacteria. The application of AIBIMM was further evaluated, analysing full-length 16S rRNA gene sequences for 2818 taxa representing the prokaryotic domains. We obtained a highly structured description of the prokaryote diversity. Sample-to-model (Si) distances were also determined for taxa included in our work. We determined Si distances for models of the six major subgroups of taxa detected in the global analyses, in addition to nested subgroups within the Alphaproteobacteria. The Si-distance evaluation showed a very good separation of the taxa within the models from those outside. We conclude that AIBIMM represents a novel phylogenetic framework suitable for accommodating the current exponential growth of 16S rRNA gene sequences in the public domain.
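The general idea of running PCA on n-mer frequency data can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the choice of n=2, the toy sequences and the use of a plain SVD-based PCA are all assumptions made for illustration, and the jackknife cross-validation described in the abstract is omitted. The residual norm computed at the end is only a simplified stand-in for a sample-to-model distance; the abstract does not give the exact definition of Si.

```python
from itertools import product

import numpy as np


def nmer_frequencies(seq, n=2):
    """Relative frequency of each overlapping n-mer over the alphabet ACGT."""
    kmers = ["".join(p) for p in product("ACGT", repeat=n)]
    index = {k: i for i, k in enumerate(kmers)}
    counts = np.zeros(len(kmers))
    for i in range(len(seq) - n + 1):
        j = index.get(seq[i:i + n])  # n-mers with ambiguous bases are skipped
        if j is not None:
            counts[j] += 1
    total = counts.sum()
    return counts / total if total else counts


def pca_fit(X, n_components=2):
    """Mean-centre X and extract the leading principal components via SVD."""
    mean = X.mean(axis=0)
    Xc = X - mean
    _, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    loadings = Vt[:n_components]
    scores = Xc @ loadings.T
    explained = (S ** 2) / (S ** 2).sum()
    return scores, loadings, mean, explained[:n_components]


def residual_distance(x, mean, loadings):
    """Norm of the part of x that the PCA model does not reconstruct.

    Assumption: a simplified proxy for a sample-to-model distance.
    """
    xc = x - mean
    return float(np.linalg.norm(xc - (xc @ loadings.T) @ loadings))


# Hypothetical toy sequences, not real 16S rRNA gene data.
seqs = ["ACGTACGTAC", "ACGTTCGTAC", "GGGCCCGGGC", "GGGCCCGGCC"]
X = np.vstack([nmer_frequencies(s, n=2) for s in seqs])
scores, loadings, mean, explained = pca_fit(X, n_components=2)

# Distance of a new (hypothetical) sequence to the fitted model.
d_new = residual_distance(nmer_frequencies("ACGTACGAAC", n=2), mean, loadings)
```

Each row of `X` is one sequence described by its 16 dinucleotide frequencies, so no alignment is needed before the PCA; sequences with similar composition end up close together in `scores`.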