Visual and statistical comparison of metagenomes
Open Access
- 10 June 2009
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 25 (15), 1849-1855
- https://doi.org/10.1093/bioinformatics/btp341
Abstract
Background: Metagenomics is the study of the genomic content of an environmental sample of microbes. Advances in the throughput and cost-efficiency of sequencing technology are fueling a rapid increase in the number and size of metagenomic datasets being generated. Bioinformatics faces the problem of how to handle and analyze these datasets efficiently and usefully. One goal of these metagenomic studies is to gain a basic understanding of the microbial world both surrounding us and within us. One major challenge is how to compare multiple datasets. Furthermore, there is a need for bioinformatics tools that can process many large datasets and are easy to use.
Results: This article describes two new techniques for comparing multiple metagenomic datasets. The first is a visualization technique for multiple datasets, and the second is a new statistical method for highlighting the differences in a pairwise comparison. We have developed implementations of both methods that are suitable for very large datasets and provide these in Version 3 of our standalone metagenome analysis tool MEGAN.
Conclusion: These new methods are suitable for the visual comparison of many large metagenomes and the statistical comparison of two metagenomes at a time. Nevertheless, more work needs to be done to support the comparative analysis of multiple metagenome datasets.
Availability: Version 3 of MEGAN, which implements all ideas presented in this article, can be obtained from our web site at: www-ab.informatik.uni-tuebingen.de/software/megan.
Contact: mitra@informatik.uni-tuebingen.de
Supplementary information: Supplementary data are available at Bioinformatics online.
This publication has 28 references indexed in Scilit:
- Signature, a web server for taxonomic characterization of sequence samples using signature genes. Nucleic Acids Research, 2008
- Phylogenetic classification of short environmental DNA fragments. Nucleic Acids Research, 2008
- Toward an ecological classification of soil bacteria. Ecology, 2007
- MEGAN analysis of metagenomic data. Genome Research, 2007
- UniFrac – An online tool for comparing microbial community diversity in a phylogenetic context. BMC Bioinformatics, 2006
- Identifying differential expression in multiple SAGE libraries: an overdispersed log-linear model approach. BMC Bioinformatics, 2005
- Differential expression in SAGE: accounting for normal between-library variation. Bioinformatics, 2003
- Genomes OnLine Database (GOLD): a monitor of genome projects world-wide. Nucleic Acids Research, 2001
- Molecular biological access to the chemistry of unknown soil microbes: a new frontier for natural products. Chemistry & Biology, 1998
- Basic local alignment search tool. Journal of Molecular Biology, 1990