The JCVI standard operating procedure for annotating prokaryotic metagenomic shotgun sequencing data
Open Access
- 30 March 2010
- journal article
- research article
- Published by Springer Nature in Standards in Genomic Sciences
- Vol. 2 (2), 229-237
- https://doi.org/10.4056/sigs.651139
Abstract
The JCVI metagenomics analysis pipeline provides for the efficient and consistent annotation of shotgun metagenomics sequencing data for sampling communities of prokaryotic organisms. The process can be equally applied to individual sequence reads from traditional Sanger capillary electrophoresis sequences, newer technologies such as 454 pyrosequencing, or sequence assemblies derived from one or more of these data types. It includes the analysis of both coding and non-coding genes, whether full-length or, as is often the case for shotgun metagenomics, fragmentary. The system is designed to provide the best-supported conservative functional annotation based on a combination of trusted homology-based scientific evidence and computational assertions and an annotation value hierarchy established through extensive manual curation. The functional annotation attributes assigned by this system include gene name, gene symbol, GO terms [1], EC numbers [2], and JCVI functional role categories [3].
This publication has 28 references indexed in Scilit:
- The comprehensive microbial resource. Nucleic Acids Research, 2009
- MetaGeneAnnotator: Detecting Species-Specific Patterns of Ribosomal Binding Site for Precise Gene Prediction in Anonymous Prokaryotic and Phage Genomes. DNA Research, 2008
- The metagenomics RAST server – a public resource for the automatic phylogenetic and functional analysis of metagenomes. BMC Bioinformatics, 2008
- Gene identification and protein classification in microbial metagenomic sequence data via incremental clustering. BMC Bioinformatics, 2008
- Environmental Shotgun Sequencing: Its Potential and Challenges for Studying the Hidden World of Microbes. PLoS Biology, 2007
- The Sorcerer II Global Ocean Sampling Expedition: Expanding the Universe of Protein Families. PLoS Biology, 2007
- MEGAN analysis of metagenomic data. Genome Research, 2007
- Bioinformatics for Whole-Genome Shotgun Sequencing of Microbial Communities. PLoS Computational Biology, 2005
- The Pfam protein families database. Nucleic Acids Research, 2004
- tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence. Nucleic Acids Research, 1997