Quality control and preprocessing of metagenomic datasets
Open Access
- 28 January 2011
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 27 (6) , 863-864
- https://doi.org/10.1093/bioinformatics/btr026
Abstract
Summary: Here, we present PRINSEQ for easy and rapid quality control and data preprocessing of genomic and metagenomic datasets. Summary statistics of FASTA (and QUAL) or FASTQ files are generated in tabular and graphical form and sequences can be filtered, reformatted and trimmed by a variety of options to improve downstream analysis.
Availability and Implementation: This open-source application was implemented in Perl and can be used as a stand alone version or accessed online through a user-friendly web interface. The source code, user help and additional information are available at http://prinseq.sourceforge.net/.
Contact: rschmied@sciences.sdsu.edu; redwards@cs.sdsu.edu
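To make the kind of processing described in the abstract concrete, the following is a minimal Python sketch of FASTQ summary statistics, quality trimming and length filtering. It is not PRINSEQ code (PRINSEQ is written in Perl) and does not reproduce its options or output. All function names, thresholds and the sample reads are hypothetical, and the sketch assumes four-line FASTQ records with Phred+33 quality encoding.

```python
import io
from statistics import mean

# Hypothetical sample data: three reads, Phred+33 encoded ('I' = Q40, '#' = Q2).
SAMPLE = "@r1\nACGTACGT\n+\nIIIIIIII\n@r2\nGGCCAT\n+\nIIII##\n@r3\nAT\n+\nII\n"


def read_fastq(handle):
    """Yield (name, sequence, quality) from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline().rstrip("\n")
        if not header:
            return
        seq = handle.readline().rstrip("\n")
        handle.readline()  # skip the '+' separator line
        qual = handle.readline().rstrip("\n")
        yield header[1:], seq, qual


def summarize(records):
    """Return basic summary statistics for a list of (name, seq, qual) records."""
    if not records:
        return {"count": 0}
    lengths = [len(seq) for _, seq, _ in records]
    total = sum(lengths)
    gc = sum(seq.upper().count("G") + seq.upper().count("C") for _, seq, _ in records)
    return {
        "count": len(records),
        "min_length": min(lengths),
        "max_length": max(lengths),
        "mean_length": total / len(records),
        "gc_percent": 100.0 * gc / total if total else 0.0,
    }


def trim_right(seq, qual, min_q=20, offset=33):
    """Remove trailing bases whose quality score is below min_q."""
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def filter_records(records, min_len=4, min_q=20, offset=33):
    """Trim each read from the 3' end, then keep reads that are long enough
    and whose mean quality after trimming is at least min_q."""
    kept = []
    for name, seq, qual in records:
        seq, qual = trim_right(seq, qual, min_q=min_q, offset=offset)
        if not seq or len(seq) < min_len:
            continue
        if mean(ord(c) - offset for c in qual) >= min_q:
            kept.append((name, seq, qual))
    return kept


records = list(read_fastq(io.StringIO(SAMPLE)))
print(summarize(records))
for name, seq, qual in filter_records(records):
    print(name, seq, qual)
```

In this sketch trimming runs before the length filter, so a read with a low-quality tail is judged on its trimmed length. A real tool would typically expose the order and the thresholds as options.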