Use of shotgun proteomics for the identification, confirmation, and correction of C. elegans gene annotations
- 24 July 2008
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 18 (10), 1660-1669
- https://doi.org/10.1101/gr.077644.108
Abstract
We describe a general mass spectrometry-based approach for gene annotation of any organism and demonstrate its effectiveness using the nematode Caenorhabditis elegans. We detected 6779 C. elegans proteins (67,047 peptides), including 384 that, although annotated in WormBase WS150, lacked cDNA or other prior experimental support. We also identified 429 new coding sequences that were unannotated in WS150. Nearly half (192/429) of the new coding sequences were confirmed with RT-PCR data. Thirty-three (∼8%) of the new coding sequences had been predicted to be pseudogenes, 151 (∼35%) reveal apparent errors in gene models, and 245 (57%) appear to be novel genes. In addition, we verified 6010 exon–exon splice junctions within existing WormBase gene models. Our work confirms that mass spectrometry is a powerful experimental tool for annotating sequenced genomes. In addition, the collection of identified peptides should facilitate future proteomics experiments targeted at specific proteins of interest.