The proteome of Toxoplasma gondii: integration with the genome provides novel insights into gene expression and annotation
Open Access
- 21 July 2008
- journal article
- Published by Springer Nature in Genome Biology
- Vol. 9 (7) , R116
- https://doi.org/10.1186/gb-2008-9-7-r116
Abstract
Background: Although the genomes of many of the most important human and animal pathogens have now been sequenced, our understanding of the actual proteins expressed by these genomes and how well they predict protein sequence and expression is still deficient. We have used three complementary approaches (two-dimensional electrophoresis, gel-liquid chromatography linked tandem mass spectrometry and MudPIT) to analyze the proteome of Toxoplasma gondii, a parasite of medical and veterinary significance, and have developed a public repository for these data within ToxoDB, making for the first time proteomics data an integral part of this key genome resource.
Results: The draft genome for Toxoplasma predicts around 8,000 genes with varying degrees of confidence. Our data demonstrate how proteomics can inform these predictions and help discover new genes. We have identified nearly one-third (2,252) of all the predicted proteins, with 2,477 intron-spanning peptides providing supporting evidence for correct splice site annotation. Functional predictions for each protein and key pathways were determined from the proteome. Importantly, we show evidence for many proteins that match alternative gene models, or previously unpredicted genes. For example, approximately 15% of peptides matched more convincingly to alternative gene models. We also compared our data with existing transcriptional data in which we highlight apparent discrepancies between gene transcription and protein expression.
Conclusion: Our data demonstrate the importance of protein data in expression profiling experiments and highlight the necessity of integrating proteomic with genomic data so that iterative refinements of both annotation and expression models are possible.