Computational comparison of two mouse draft genomes and the human golden path
Open Access
- 1 January 2002
- journal article
- research article
- Published by Springer Nature in Genome Biology
- Vol. 4 (1), 1-10
- https://doi.org/10.1186/gb-2002-4-1-r1
Abstract
The availability of both mouse and human draft genomes has marked the beginning of a new era of comparative mammalian genomics. The two available mouse genome assemblies, from the public mouse genome sequencing consortium and from Celera Genomics, were obtained using different clone libraries and different assembly methods. We present here a critical comparison of the two latest mouse genome assemblies. The utility of the combined genomes is further demonstrated by comparing them with the human 'golden path' and through a subsequent analysis of the resulting conserved sequence element (CSE) database, which allows us to identify over 6,000 potential novel genes and to derive independent estimates of the number of human protein-coding genes.
The Celera and public mouse assemblies differ in about 10% of the mouse genome. Each assembly has advantages over the other: Celera has higher base-pair accuracy and higher overall coverage of the genome; the public assembly, however, has higher sequence quality in some newly finished bacterial artificial chromosome (BAC) clone regions, and its data are freely accessible. Perhaps most importantly, by combining both assemblies we can obtain a better annotation of the human genome; in particular, we can obtain the most complete set of CSEs, one third of which are related to known genes, while some others are related to other functional genomic regions. More than half of the CSEs are of unknown function. From the CSEs, we estimate the total number of human protein-coding genes to be about 40,000.
This searchable, publicly available online CSE database (CSEdb) will expedite new discoveries through comparative genomics.