Sequencing platform and library preparation choices impact viral metagenomes
Open Access
- 10 May 2013
- journal article
- research article
- Published by Springer Nature in BMC Genomics
- Vol. 14 (1) , 1-12
- https://doi.org/10.1186/1471-2164-14-320
Abstract
Microbes drive the biogeochemistry that fuels the planet. Microbial viruses modulate their hosts directly through mortality and horizontal gene transfer, and indirectly by re-programming host metabolisms during infection. However, our ability to study these virus-host interactions is limited by methods that are low-throughput and heavily reliant upon the subset of organisms that are in culture. One way forward is to use culture-independent metagenomic approaches, but these novel methods are rarely rigorously tested, especially for studies of environmental viruses, air microbiomes, extreme environment microbiology and other areas with constrained sample amounts. Here we perform replicated experiments to evaluate Roche 454, Illumina HiSeq, and Ion Torrent PGM sequencing and library preparation protocols on virus metagenomes generated from as little as 10 pg of DNA. Using %G + C content to compare metagenomes, we find that (i) metagenomes are highly replicable, (ii) some treatment effects are minimal, e.g., sequencing technology choice has 6-fold less impact than varying input DNA amount, and (iii) when restricted to a limited DNA concentration (<1 μg), changing the amount of amplification produces little variation. These trends were also observed when examining the metagenomes for gene function and assembly performance, although the latter was more closely aligned to sequencing effort and read length than to the preparation steps tested. Among Illumina library preparation options, transposon-based libraries diverged from all others and adaptor ligation was a critical step for optimizing sequencing yields. These data guide researchers in generating systematic, comparative datasets to understand complex ecosystems, and suggest that neither varied amplification nor sequencing platforms will deter such efforts.
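To make the %G + C comparison concrete, the following is a minimal sketch of one way to summarize and compare two sets of reads by their %G + C distributions. It is not the authors' pipeline: the function names, the 5% bin width, and the use of half the L1 distance between histograms as a dissimilarity measure are all assumptions chosen for illustration.

```python
from collections import Counter


def gc_percent(seq):
    """Return %G+C of a sequence, ignoring ambiguous bases such as N."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return 100.0 * gc / total if total else 0.0


def gc_histogram(reads, bin_width=5):
    """Return the fraction of reads falling in each %G+C bin.

    Keys are the lower edge of each bin; bin_width is an assumed value.
    Reads at exactly 100% G+C are placed in the top bin.
    """
    counts = Counter()
    n = 0
    top_bin = 100 // bin_width - 1
    for read in reads:
        b = min(int(gc_percent(read) // bin_width), top_bin)
        counts[b * bin_width] += 1
        n += 1
    if n == 0:
        return {}
    return {k: v / n for k, v in sorted(counts.items())}


def histogram_distance(h1, h2):
    """Half the L1 distance between two histograms (0 = identical, 1 = disjoint).

    This metric is an illustrative assumption, not taken from the paper.
    """
    keys = set(h1) | set(h2)
    return 0.5 * sum(abs(h1.get(k, 0.0) - h2.get(k, 0.0)) for k in keys)
```

In use, each metagenome's reads would be passed through `gc_histogram`, and pairs of histograms compared with `histogram_distance`; replicate libraries would be expected to give smaller distances than libraries prepared under different treatments.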