TagCleaner: Identification and removal of tag sequences from genomic and metagenomic datasets

Open Access

23 June 2010

journal article
Published by Springer Nature in BMC Bioinformatics

Vol. 11 (1), 341
https://doi.org/10.1186/1471-2105-11-341

Abstract

Sequencing metagenomes that were pre-amplified with primer-based methods requires the removal of the additional tag sequences from the datasets. The sequenced reads can contain deletions or insertions due to sequencing limitations, and the primer sequence may contain ambiguous bases. Furthermore, the tag sequence may be unavailable or incorrectly reported. Because unwanted sequence contamination can introduce inaccuracies downstream, it is important to use reliable tools for pre-processing sequence data.
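To make the matching problem concrete, the following is a minimal sketch of trimming a 5' tag from a read while tolerating ambiguous tag bases and a small number of insertions, deletions, or substitutions. It is an illustration only and is not taken from TagCleaner: the function names, the `max_errors` parameter, and the tie-breaking rule are assumptions chosen for this example, and the approach shown is a plain edit-distance alignment of the tag against read prefixes.

```python
# Illustrative sketch only; not the TagCleaner implementation.
# Standard IUPAC nucleotide codes mapped to the bases they can stand for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def bases_match(tag_base, read_base):
    """True if the read base is one of the bases the tag code allows."""
    return read_base.upper() in IUPAC.get(tag_base.upper(), "")


def trim_5prime_tag(read, tag, max_errors=1):
    """Remove a leading tag from `read` if it aligns within `max_errors` edits.

    The whole tag is aligned against every read prefix of length up to
    len(tag) + max_errors; the prefix with the lowest edit distance is
    removed. On ties the shortest prefix wins (an arbitrary, conservative
    choice for this sketch). The read is returned unchanged if no prefix
    is close enough.
    """
    window = min(len(read), len(tag) + max_errors)
    # prev[j] holds the edit distance between tag[:i] and read[:j].
    prev = list(range(window + 1))
    for i in range(1, len(tag) + 1):
        curr = [i] + [0] * window
        for j in range(1, window + 1):
            mismatch = 0 if bases_match(tag[i - 1], read[j - 1]) else 1
            curr[j] = min(
                prev[j - 1] + mismatch,  # match or substitution
                prev[j] + 1,             # tag base missing from the read
                curr[j - 1] + 1,         # extra base inserted in the read
            )
        prev = curr
    best_end = min(range(window + 1), key=lambda j: prev[j])
    if prev[best_end] <= max_errors:
        return read[best_end:]
    return read


if __name__ == "__main__":
    print(trim_5prime_tag("ACGTTTGACC", "ACNT"))        # ambiguous tag base
    print(trim_5prime_tag("ACTACGGG", "ACGTAC"))        # deletion in the read
    print(trim_5prime_tag("ACGGTACTTT", "ACGTAC"))      # insertion in the read
```

This sketch handles only a tag at the start of a read; a 3' tag could be handled the same way by reversing both strings first. It does not address the case where the tag sequence is unknown.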
