Computational Inference of Homologous Gene Structures in the Human Genome
Open Access
- 1 May 2001
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 11 (5) , 803-816
- https://doi.org/10.1101/gr.175701
Abstract
With the human genome sequence approaching completion, a major challenge is to identify the locations and encoded protein sequences of all human genes. To address this problem we have developed a new gene identification algorithm, GenomeScan, which combines exon–intron and splice signal models with similarity to known protein sequences in an integrated model. Extensive testing shows that GenomeScan can accurately identify the exon–intron structures of genes in finished or draft human genome sequence with a low rate of false-positives. Application of GenomeScan to 2.7 billion bases of human genomic DNA identified at least 20,000–25,000 human genes out of an estimated 30,000–40,000 present in the genome. The results show an accurate and efficient automated approach for identifying genes in higher eukaryotic genomes and provide a first-level annotation of the draft human genome.