Gene recognition in eukaryotic DNA by comparison of genomic sequences
Open Access
- 1 November 2001
- journal article
- research article
- Published by Oxford University Press (OUP) in Bioinformatics
- Vol. 17 (11) , 1011-1018
- https://doi.org/10.1093/bioinformatics/17.11.1011
Abstract
Motivation: Sequencing of complete eukaryotic genomes and large syntenic fragments of genomes makes it possible to apply genomic comparison for gene recognition.
Results: This paper describes a spliced alignment algorithm that aligns candidate exon chains of two homologous genomic sequence fragments from different species. The algorithm is implemented in Pro-Gen software. Unlike other algorithms, Pro-Gen does not assume conservation of the exon–intron structure. Amino acid sequences obtained by the formal translation of candidate exons are aligned instead of nucleotide sequences, which allows for distant comparisons. The algorithm was tested on a sample of human–mammal (mouse), human–vertebrate (Xenopus) and human–invertebrate (Drosophila) gene pairs. Surprisingly, the best results, 97–98% correlation between the actual and predicted genes, were obtained for the more distant comparisons, whereas the correlation on the human–mouse sample was only 93%. The latter value increases to 95% if conservation of the exon–intron structure is assumed. The lower human–mouse value is caused by a large amount of sequence conservation in non-coding regions of the human and mouse genes, probably due to regulatory elements.
Availability: Pro-Gen v. 3.0 is available to academic researchers free of charge at http://www.anchorgen.com/pro_gen/pro_gen.html.
Contact: misha@imb.imb.ac.ru
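The abstract's point that aligning formally translated candidate exons, rather than nucleotides, permits more distant comparisons can be illustrated with a small sketch. This is not Pro-Gen's implementation, and it does not perform spliced alignment or exon chaining. The function names, the toy sequences, and the scoring values (match +1, mismatch -1, gap -2) are all assumptions made for illustration; only the standard genetic code is factual.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(exon, frame=0):
    """Formally translate a candidate exon in the given reading frame.

    A trailing partial codon is ignored.
    """
    codons = (exon[i:i + 3] for i in range(frame, len(exon) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)


def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Needleman-Wunsch global alignment score (hypothetical scoring values)."""
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


# Two toy exons (invented for this example) that differ only at
# synonymous third-codon positions.
exon_a = "ATGGCTCTGAAA"
exon_b = "ATGGCACTTAAG"

protein_a = translate(exon_a)
protein_b = translate(exon_b)

print(protein_a, protein_b)
print("nucleotide score:", global_alignment_score(exon_a, exon_b))
print("protein score:", global_alignment_score(protein_a, protein_b))
```

In this toy case the two exons differ at the nucleotide level but translate to the same peptide, so the amino acid alignment is perfect while the nucleotide alignment is not. Over long evolutionary distances synonymous changes accumulate, which is one reason protein-level comparison can remain informative when nucleotide similarity has decayed.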