SPIDER: SOFTWARE FOR PROTEIN IDENTIFICATION FROM SEQUENCE TAGS WITH DE NOVO SEQUENCING ERROR
- 1 June 2005
- journal article
- research article
- Published by World Scientific Pub Co Pte Ltd in Journal of Bioinformatics and Computational Biology
- Vol. 03 (03), 697-716
- https://doi.org/10.1142/s0219720005001247
Abstract
For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then matched, allowing for amino acid mutations, against the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach, and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software.
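The error type named in the abstract, one segment swapped for another of approximately the same mass, can be illustrated with a small sketch. This is not the SPIDER algorithm, which the abstract does not specify; it is a toy check under stated assumptions. The function names, the 0.02 Da tolerance, and the strategy of stripping the common prefix and suffix are all hypothetical choices made for illustration, and homology mutations are ignored. The residue masses are standard monoisotopic values.

```python
# Toy illustration only: NOT the SPIDER algorithm.
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def segment_mass(segment: str) -> float:
    """Sum of the residue masses of an amino acid segment."""
    return sum(RESIDUE_MASS[aa] for aa in segment)


def same_mass(a: str, b: str, tol: float = 0.02) -> bool:
    """True if two segments have equal mass within `tol` Da (assumed tolerance)."""
    return abs(segment_mass(a) - segment_mass(b)) <= tol


def explained_by_one_mass_swap(tag: str, peptide: str, tol: float = 0.02) -> bool:
    """True if `tag` equals `peptide` except for one contiguous segment
    that was replaced by a segment of approximately the same mass."""
    # Length of the common prefix.
    i = 0
    while i < len(tag) and i < len(peptide) and tag[i] == peptide[i]:
        i += 1
    # Length of the common suffix, not overlapping the prefix.
    j = 0
    while (j < len(tag) - i and j < len(peptide) - i
           and tag[-1 - j] == peptide[-1 - j]):
        j += 1
    # Compare the masses of the two differing middle segments.
    tag_mid = tag[i:len(tag) - j]
    pep_mid = peptide[i:len(peptide) - j]
    return same_mass(tag_mid, pep_mid, tol)


if __name__ == "__main__":
    print(explained_by_one_mass_swap("LGGK", "LNK"))  # GG vs N: True
    print(explained_by_one_mass_swap("LGAK", "LQK"))  # GA vs Q: True
    print(explained_by_one_mass_swap("LGAK", "LEK"))  # GA vs E: False
```

Here the tag `LGGK` is accepted as a match to the database peptide `LNK` because two glycines weigh almost exactly as much as one asparagine, which is the kind of de novo sequencing error the abstract describes. A practical matcher would also have to handle several such segments per tag and genuine mutations at the same time.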