Comparative Analysis of Noncoding Regions of 77 Orthologous Mouse and Human Gene Pairs
Open Access
- 1 September 1999
- journal article
- Published by Cold Spring Harbor Laboratory in Genome Research
- Vol. 9 (9) , 815-824
- https://doi.org/10.1101/gr.9.9.815
Abstract
A data set of 77 genomic mouse/human gene pairs has been compiled from the EMBL nucleotide database, and their corresponding features determined. This set was used to analyze the degree of conservation of noncoding sequences between mouse and human. A new alignment algorithm was developed to cope with the fact that large parts of noncoding sequences are not alignable in a meaningful way because of genetic drift. This new algorithm, DNA Block Aligner (DBA), finds colinear-conserved blocks that are flanked by nonconserved sequences of varying lengths. The noncoding regions of the data set were aligned with DBA. The proportion of the noncoding regions covered by blocks >60% identical was 36% for upstream regions, 50% for 5′ UTRs, 23% for introns, and 56% for 3′ UTRs. These blocks of high identity were more or less evenly distributed across the length of the features, except for upstream regions in which the first 100 bp upstream of the transcription start site was covered in up to 70% of the gene pairs. This data set complements earlier sets on the basis of cDNA sequences and will be useful for further comparative studies. [This paper contains supplementary data that can be found at http://www.genome.com.]
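The coverage figures quoted in the abstract (the proportion of a noncoding region covered by blocks more than 60% identical) can be illustrated with a minimal sketch. This is not the DBA algorithm and not the authors' code. It assumes the conserved blocks have already been found by some aligner and are given as hypothetical `(start, end, identity)` tuples in half-open, 0-based coordinates on one sequence. The function name `coverage_fraction` and the example numbers are invented for illustration.

```python
from typing import List, Tuple

# Hypothetical block representation: (start, end, identity),
# half-open 0-based coordinates, identity as a fraction in [0, 1].
Block = Tuple[int, int, float]


def coverage_fraction(blocks: List[Block], region_length: int,
                      min_identity: float = 0.60) -> float:
    """Fraction of a region covered by blocks with identity > min_identity.

    Overlapping blocks are counted once, and blocks are clipped
    to the region boundaries.
    """
    if region_length <= 0:
        raise ValueError("region_length must be positive")

    # Keep only blocks above the identity threshold, sorted by start.
    kept = sorted((s, e) for s, e, ident in blocks if ident > min_identity)

    covered = 0
    current_end = 0  # right edge of what has been counted so far
    for start, end in kept:
        start = max(start, current_end, 0)
        end = min(end, region_length)
        if end > start:
            covered += end - start
            current_end = end
    return covered / region_length


# Invented example: a 200 bp region with four blocks, one below threshold.
example_blocks = [(0, 50, 0.72), (40, 80, 0.65), (120, 150, 0.55), (160, 200, 0.90)]
fraction = coverage_fraction(example_blocks, region_length=200)
print(f"{fraction:.0%} of the region is covered by blocks >60% identical")
```

Merging overlaps before summing keeps any base from being counted twice, which matters if block coordinates from different alignment passes are pooled.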