Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences
- 11 December 2002
- journal article
- research article
- Published in Proceedings of the National Academy of Sciences
- Vol. 99 (26), 16899-16903
- https://doi.org/10.1073/pnas.242603899
Abstract
The National Institutes of Health Mammalian Gene Collection (MGC) Program is a multiinstitutional effort to identify and sequence a cDNA clone containing a complete ORF for each human and mouse gene. ESTs were generated from libraries enriched for full-length cDNAs and analyzed to identify candidate full-ORF clones, which then were sequenced to high accuracy. The MGC has currently sequenced and verified the full ORF for a nonredundant set of >9,000 human and >6,000 mouse genes. Candidate full-ORF clones for an additional 7,800 human and 3,500 mouse genes also have been identified. All MGC sequences and clones are available without restriction through public databases and clone distribution networks (see http://mgc.nci.nih.gov).