PROTOGENE: turning amino acid alignments into bona fide CDS nucleotide alignments

Open Access

1 July 2006

journal article
research article
Published by Oxford University Press (OUP) in Nucleic Acids Research

Vol. 34 (Web Server), W600-W603
https://doi.org/10.1093/nar/gkl170

Abstract

We describe Protogene, a server that can turn a protein multiple sequence alignment into the equivalent alignment of the original gene-coding DNA. Protogene relies on a pipeline in which every initial protein sequence is BLASTed against RefSeq or NR. The annotation associated with potential matches is used to identify the gene sequence. This gene sequence is then aligned with the query protein using Exonerate in order to extract a coding nucleotide sequence matching the original protein. Protogene can handle protein fragments and will return every CDS coding for a given protein, even when those CDSs occur in different genomes. Protogene is available from http://www.tcoffee.org/.
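
The final step implied by the abstract, carrying the gaps of the protein alignment over to the recovered coding sequences, can be illustrated with a minimal Python sketch. It is not Protogene's implementation: it assumes the CDS for each protein has already been retrieved (the BLAST and Exonerate stages are not reproduced), that each CDS is exactly three times the length of its ungapped protein with no stop codon, and that gaps are written as "-". The function name `thread_cds` and the toy sequences are hypothetical.

```python
def thread_cds(aligned_protein: str, cds: str, gap: str = "-") -> str:
    """Thread a CDS onto a gapped protein sequence.

    Each residue is replaced by its codon and each gap column by three gaps.
    Assumes (hypothetically) that the CDS matches the protein codon for codon.
    """
    residues = aligned_protein.replace(gap, "")
    if len(cds) != 3 * len(residues):
        raise ValueError("CDS length must be 3x the ungapped protein length")

    pieces = []
    pos = 0  # current offset into the CDS
    for symbol in aligned_protein:
        if symbol == gap:
            pieces.append(gap * 3)
        else:
            pieces.append(cds[pos:pos + 3])
            pos += 3
    return "".join(pieces)


# Toy input: a two-sequence protein alignment and the matching CDSs.
protein_alignment = {"seq1": "MK-L", "seq2": "MKAL"}
cds_by_name = {"seq1": "ATGAAACTG", "seq2": "ATGAAAGCGCTG"}

codon_alignment = {
    name: thread_cds(aligned, cds_by_name[name])
    for name, aligned in protein_alignment.items()
}

for name, row in codon_alignment.items():
    print(name, row)
```

Because every protein column maps to exactly three nucleotide columns, the resulting rows all have the same length and the reading frame is preserved across the alignment.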
