Comparative analysis of 1196 orthologous mouse and human full-length mRNA and protein sequences.
Open Access
- 1 September 1996
- journal article
- research article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 6 (9), 846-857
- https://doi.org/10.1101/gr.6.9.846
Abstract
A large set of mRNA and encoded protein sequences, from orthologous murine and human genes, was compiled to analyze statistical, biological, and evolutionary properties of coding and noncoding transcribed sequences. Protein sequence conservation varied between 36% and 100% identity, with an average value of 85%. The average degree of nucleotide sequence identity for the corresponding coding sequences was also approximately 85%, whereas 5' and 3' untranslated regions (UTRs) were less conserved, with aligned identities of 67% and 69%, respectively. For some mouse and human genes, nucleotide sequences are more highly conserved than the encoded protein sequences. A subset of 32 sequences, consisting of only mouse/human protein pairs for which the human sequence represents a positionally cloned disease gene, had properties very similar to the larger data set, suggesting that our data are representative of the genome as a whole. With respect to sequence conservation, two interesting outliers are the breast cancer (BRCA1) gene product and the testis-determining factor (SRY), both of which display among the lowest degrees of sequence identity. The occurrence of both introns and repetitive elements (e.g., Alu, B1) in 5' and 3' UTRs was also studied. These results provide one benchmark for the "comparative genomics" of mice and humans, with practical implications for the cross-referencing of transcript maps. Also, they should prove useful in estimating the additional sampling diversity provided by mouse EST sequencing projects designed to complement the existing human cDNA collection.
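The identity percentages quoted in the abstract are computed over aligned sequence pairs. The abstract does not state how gaps were treated, so the following Python sketch is only an illustration under one common convention: identity is the share of matching positions among aligned columns where neither sequence has a gap. The function name `percent_identity`, the gap character `-`, and the toy sequences are assumptions made for this example and are not taken from the paper.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption: gapped columns are excluded from the denominator. The paper
    may have used a different convention.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0  # columns with a residue in both sequences
    matches = 0   # columns where the two residues are identical
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue
        compared += 1
        if x == y:
            matches += 1

    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy pre-aligned pair (hypothetical, not real mouse/human data):
# 9 ungapped columns, 8 of them identical.
mouse = "ATGGCC-TAA"
human = "ATGACCGTAA"
print(f"{percent_identity(mouse, human):.1f}% identity")
```

Counting gapped columns in the denominator instead would lower the reported value, which is one reason identity figures from different studies are not always directly comparable.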