Phylogenetic classification of short environmental DNA fragments
Open Access
- 19 February 2008
- journal article
- research article
- Published by Oxford University Press (OUP) in Nucleic Acids Research
- Vol. 36 (7), 2230-2239
- https://doi.org/10.1093/nar/gkn038
Abstract
Metagenomics is providing striking insights into the ecology of microbial communities. The recently developed massively parallel 454 pyrosequencing technique gives the opportunity to rapidly obtain metagenomic sequences at a low cost and without cloning bias. However, the phylogenetic analysis of the short reads produced represents a significant computational challenge. The phylogenetic algorithm CARMA for predicting the source organisms of environmental 454 reads is described. The algorithm searches for conserved Pfam domain and protein families in the unassembled reads of a sample. These gene fragments (environmental gene tags, EGTs), are classified into a higher-order taxonomy based on the reconstruction of a phylogenetic tree of each matching Pfam family. The method exhibits high accuracy for a wide range of taxonomic groups, and EGTs as short as 27 amino acids can be phylogenetically classified up to the rank of genus. The algorithm was applied in a comparative study of three aquatic microbial samples obtained by 454 pyrosequencing. Profound differences in the taxonomic composition of these samples could be clearly revealed.
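The abstract states that EGTs are assigned to a higher-order taxonomy from a phylogenetic tree of the matching Pfam family, but it does not say how a taxon is read off that tree. As a rough illustration only, the sketch below assumes one simple rule: an EGT receives the deepest taxonomic ranks on which all of its nearest reference sequences in the tree agree. The function name `shared_lineage`, the rank list `RANKS`, and the example lineages are hypothetical and are not taken from CARMA.

```python
from typing import List, Sequence, Tuple

# Assumed rank order, from most general to most specific (illustrative only).
RANKS = ("superkingdom", "phylum", "class", "order", "family", "genus")


def shared_lineage(
    neighbor_lineages: Sequence[Sequence[str]],
) -> List[Tuple[str, str]]:
    """Return the (rank, taxon) pairs on which all neighbor lineages agree.

    Each lineage lists taxa in the order given by RANKS. The walk stops at
    the first rank where the neighbors disagree, so the result is the
    common prefix of all lineages.
    """
    shared: List[Tuple[str, str]] = []
    if not neighbor_lineages:
        return shared
    # zip(*lineages) groups the taxa of all neighbors rank by rank.
    for rank, taxa in zip(RANKS, zip(*neighbor_lineages)):
        if len(set(taxa)) != 1:
            break
        shared.append((rank, taxa[0]))
    return shared


# Hypothetical neighbors of one EGT in a family tree.
neighbors = [
    ("Bacteria", "Proteobacteria", "Gammaproteobacteria",
     "Enterobacterales", "Enterobacteriaceae", "Escherichia"),
    ("Bacteria", "Proteobacteria", "Gammaproteobacteria",
     "Enterobacterales", "Enterobacteriaceae", "Salmonella"),
]

for rank, taxon in shared_lineage(neighbors):
    print(f"{rank}: {taxon}")
```

Under this assumed rule, the two example neighbors agree down to the family rank but differ at genus, so the EGT would be reported at family level rather than forced into a genus.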