Structure of the promoter for the chicken α2 type I collagen gene.

Abstract
The chicken α2 type I collagen gene is 38 kilobases long, and its coding information is subdivided into more than 50 exons. Primer extension and S1 nuclease mapping were used to determine the sequence of the 5' end of α2 collagen mRNA and to locate the start site for transcription of the α2 collagen gene. The DNA sequence around the transcription start site shows a typical Goldberg-Hogness sequence (TATA box), 5' T-A-T-A-A-A-T 3', between -33 and -26 and a 5' G-C-C-C-A-T-T 3' sequence (CAT box) between -84 and -78. Three AUG codons are found in the initial portion of the mRNA: the first from +54 to +56, the second from +117 to +119, and the third from +134 to +136. The first two AUG codons are followed by short coding sequences that could specify a hexapeptide and a tetrapeptide, respectively. Only the third AUG is followed by an open reading frame coding for a sequence that shows considerable homology with the previously determined amino acid sequence of prepro α1 collagen. The promoter region contains several extensive dyads of symmetry. Three of these inverted repeats, which precede the start site for transcription, overlap each other and may have a role in the developmental regulation of this gene.
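
The AUG analysis described above can be illustrated with a minimal Python sketch. It is not the authors' procedure, only one plausible way to list every AUG in an mRNA leader and measure how far each reading frame runs before an in-frame stop codon. The function name `scan_aug_orfs` and the sequence `toy` are hypothetical and invented for illustration; the toy sequence is not the α2 collagen sequence.

```python
# Standard stop codons in the universal genetic code.
STOP_CODONS = {"UAA", "UAG", "UGA"}


def scan_aug_orfs(mrna):
    """List every AUG in the sequence with the length of its reading frame.

    Returns (position, codon_count) tuples. The position is 1-based,
    counted from the first nucleotide supplied. codon_count includes the
    AUG itself and excludes the stop codon; it is None when no in-frame
    stop codon occurs before the sequence ends.
    """
    mrna = mrna.upper().replace("T", "U")  # accept DNA or RNA letters
    results = []
    for i in range(len(mrna) - 2):
        if mrna[i:i + 3] != "AUG":
            continue
        codon_count = None
        # Walk codon by codon in the frame set by this AUG.
        for j in range(i, len(mrna) - 2, 3):
            if mrna[j:j + 3] in STOP_CODONS:
                codon_count = (j - i) // 3
                break
        results.append((i + 1, codon_count))
    return results


# Hypothetical toy sequence (made up): a short frame closed by UAA,
# then a second AUG whose frame stays open to the end.
toy = "GGAUGCCCAAAUAAGGAUGGCUCUC"
print(scan_aug_orfs(toy))
```

If the supplied sequence begins at the transcription start site, the reported positions correspond to the +1-based coordinates used in the abstract. A short codon count marks a frame like those following the first two AUG codons, while `None` or a large count marks an open reading frame like the one following the third.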