Reevaluating Human Gene Annotation: A Second-Generation Analysis of Chromosome 22
Open Access
- 30 December 2002
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 13 (1) , 27-36
- https://doi.org/10.1101/gr.695703
Abstract
We report a second-generation gene annotation of human chromosome 22. Using expressed sequence databases, comparative sequence analysis, and experimental verification, we have extended genes, fused previously fragmented structures, and identified new genes. The total length in exons of annotation was increased by 74% over our previously published annotation and includes 546 protein-coding genes and 234 pseudogenes. Thirty-two potential protein-coding annotations are partial copies of other genes, and may represent duplications on an evolutionary path to change or loss of function. We also identified 31 non-protein-coding transcripts, including 16 possible antisense RNAs. By extrapolation, we estimate the human genome contains 29,000–36,000 protein-coding genes, 21,300 pseudogenes, and 1500 antisense RNAs. We suggest that our revised annotation criteria provide a paradigm for future annotation of the human genome.
[Supplemental material is available online at www.genome.org. The sequence data from this study have been submitted to GenBank under accession nos. AL009266, AL021682-3, AL021708, AL022729, AL035081-2, AL035364, AL035366, AL035545, AL049654, AL050253-8, AL050345-6, AL079310, AL096779-81, AL096879-81, AL096883, AL096886, AL138578, AL157851, AL159142-3, AL160111-2, AL160131-2, AL160311, AL355092, AL355192, AL355841, AL359401, AL359403, AL365511-5, AL442116, AL449243, AL449244, AL450314, AL589866-7, AL590120, AL590887-8, BU583989–BU585359. The following individuals kindly provided reagents, samples, or unpublished information as indicated in the paper: J. Seilhamer, L. Stuve, H. Roest-Crollius, A. Levine, G. Slater, and J. Kent.]
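The abstract does not say how the genome-wide figures were extrapolated from the chromosome 22 counts. As a rough illustration only, the sketch below computes the scale-up ratios implied by the numbers quoted in the abstract. The variable and function names are hypothetical, and the ratios are simple quotients of the reported figures, not the authors' method.

```python
# Counts reported for chromosome 22 in the abstract.
chr22_counts = {
    "protein_coding": 546,
    "pseudogenes": 234,
    "antisense_rnas": 16,  # "16 possible antisense RNAs"
}

# Genome-wide estimates from the abstract, as (low, high) ranges.
# Point estimates are stored with low == high.
genome_estimates = {
    "protein_coding": (29_000, 36_000),
    "pseudogenes": (21_300, 21_300),
    "antisense_rnas": (1_500, 1_500),
}


def implied_ratios(chr_counts, estimates):
    """Return the (low, high) genome-to-chromosome ratio per category.

    This is plain division of the published numbers; it makes no claim
    about how the estimates were actually derived.
    """
    return {
        name: (estimates[name][0] / count, estimates[name][1] / count)
        for name, count in chr_counts.items()
    }


ratios = implied_ratios(chr22_counts, genome_estimates)

for name, (low, high) in ratios.items():
    if low == high:
        print(f"{name}: about {low:.1f}x")
    else:
        print(f"{name}: about {low:.1f}x to {high:.1f}x")
```

The implied ratio for protein-coding genes is noticeably lower than for pseudogenes and antisense RNAs, which suggests that the three estimates were not produced by one uniform scaling factor. The abstract alone does not explain the difference.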