MetaSim—A Sequencing Simulator for Genomics and Metagenomics
Open Access
- 8 October 2008
- journal article
- research article
- Published by Public Library of Science (PLoS) in PLOS ONE
- Vol. 3 (10), e3373
- https://doi.org/10.1371/journal.pone.0003373
Abstract
The new research field of metagenomics is providing exciting insights into various, previously unclassified ecological systems. Next-generation sequencing technologies are producing a rapid increase in environmental data in public databases. There is a great need for specialized software solutions and statistical methods for dealing with complex metagenome data sets. To facilitate the development and improvement of metagenomic tools and the planning of metagenomic projects, we introduce a sequencing simulator called MetaSim. Our software can be used to generate collections of synthetic reads that reflect the diverse taxonomical composition of typical metagenome data sets. Based on a database of given genomes, the program allows the user to design a metagenome by specifying the number of genomes present at different levels of the NCBI taxonomy, and then to collect reads from the metagenome using a simulation of a number of different sequencing technologies. A population sampler optionally produces evolved sequences based on source genomes and a given evolutionary tree. MetaSim allows the user to simulate individual read datasets that can be used as standardized test scenarios for planning sequencing projects or for benchmarking metagenomic software.
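The read-collection step described in the abstract can be illustrated with a minimal sketch. This is not MetaSim's implementation: the function name `simulate_reads`, its parameters, the weighting of each genome by abundance times length, and the uniform substitution error model are all assumptions made for illustration, standing in for the technology-specific error models the abstract mentions.

```python
import random


def simulate_reads(genomes, abundances, n_reads, read_len, error_rate=0.0, seed=0):
    """Draw synthetic reads from a toy metagenome (hypothetical helper).

    genomes:    dict mapping genome name -> DNA string
    abundances: dict mapping genome name -> relative copy number
    Returns a list of (genome name, start position, read sequence) tuples.
    """
    rng = random.Random(seed)
    # Skip genomes that are too short to yield a full-length read.
    names = [name for name in genomes if len(genomes[name]) >= read_len]
    # Assumption: the chance of sampling a genome scales with its
    # relative abundance times its length.
    weights = [abundances[name] * len(genomes[name]) for name in names]

    reads = []
    for _ in range(n_reads):
        name = rng.choices(names, weights=weights, k=1)[0]
        seq = genomes[name]
        start = rng.randrange(len(seq) - read_len + 1)
        bases = []
        for base in seq[start:start + read_len]:
            # Assumption: independent substitution errors at a fixed rate.
            if rng.random() < error_rate:
                base = rng.choice([b for b in "ACGT" if b != base])
            bases.append(base)
        reads.append((name, start, "".join(bases)))
    return reads


# Toy usage: two made-up genomes with a 3:1 abundance ratio.
toy_genomes = {"genome_a": "ACGT" * 50, "genome_b": "GGCC" * 50}
toy_reads = simulate_reads(toy_genomes, {"genome_a": 3, "genome_b": 1},
                           n_reads=5, read_len=20, error_rate=0.01, seed=42)
for name, start, read in toy_reads:
    print(name, start, read)
```

Because each read keeps its source genome and start position, a dataset built this way could serve as a labeled test scenario of the kind the abstract describes for benchmarking.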