Identification of Novel Human Genes Evolutionarily Conserved in Caenorhabditis elegans by Comparative Proteomics
Open Access
- 1 May 2000
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 10 (5), 703-713
- https://doi.org/10.1101/gr.10.5.703
Abstract
Modern biomedical research greatly benefits from large-scale genome-sequencing projects ranging from studies of viruses, bacteria, and yeast to multicellular organisms, like Caenorhabditis elegans. Comparative genomic studies offer a vast array of prospects for identification and functional annotation of human ortholog genes. We presented a novel comparative proteomic approach for assembling human gene contigs and assisting gene discovery. The C. elegans proteome was used as an alignment template to assist in novel human gene identification from human EST nucleotide databases. Among the available 18,452 C. elegans protein sequences, our results indicate that at least 83% (15,344 sequences) of C. elegans proteome has human homologous genes, with 7,954 records of C. elegans proteins matching known human gene transcripts. Only 11% or less of C. elegans proteome contains nematode-specific genes. We found that the remaining 7,390 sequences might lead to discoveries of novel human genes, and over 150 putative full-length human gene transcripts were assembled upon further database analyses. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. AF132936–AF132973, AF151799–AF151909, and AF152097.]
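The counts quoted in the abstract can be checked against one another with a few lines of arithmetic. The sketch below is only an illustration of that bookkeeping, not the authors' pipeline; the function and key names are hypothetical, and the only inputs are the three counts stated in the abstract.

```python
# Counts taken directly from the abstract.
TOTAL_PROTEINS = 18_452              # available C. elegans protein sequences
WITH_HUMAN_HOMOLOG = 15_344          # sequences with human homologous genes
MATCHING_KNOWN_TRANSCRIPTS = 7_954   # sequences matching known human gene transcripts


def summarize(total, with_homolog, known):
    """Derive the remaining quantities from the three stated counts.

    Hypothetical helper for illustration; names are not from the paper.
    """
    return {
        # Homolog-bearing sequences that do not match a known human transcript.
        "novel_candidates": with_homolog - known,
        # Sequences for which no human homolog was reported.
        "no_homolog_reported": total - with_homolog,
        # Share of the proteome with a human homolog.
        "homolog_fraction": with_homolog / total,
    }


summary = summarize(TOTAL_PROTEINS, WITH_HUMAN_HOMOLOG, MATCHING_KNOWN_TRANSCRIPTS)
print(summary["novel_candidates"])            # remaining candidate sequences
print(f"{summary['homolog_fraction']:.1%}")   # share with a human homolog
```

The difference 15,344 − 7,954 reproduces the 7,390 remaining sequences, and 15,344 / 18,452 is consistent with the stated "at least 83%". The "11% or less" figure for nematode-specific genes cannot be derived from these three counts alone and is not recomputed here.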