Gene structure and 5'‐upstream sequence of rat cathepsin L

Abstract
The structure of the rat cathepsin L gene has been determined. The gene spans 8.5 kilobase pairs, comprises 8 exons, and has an intron located near the active-site cysteine residue. The exon-intron organization does not correspond well to the functional units of the proteinase. These characteristics are common to the cysteine proteinase gene family. The 5'-upstream region contains one CAAT-box, four SP-1 binding sites, two AP-2 binding sites, and a CRE, but no typical TATA-box. In addition, SP-1 and AP-2 binding sites and an octamer motif are found in the first intron, suggesting a complex regulatory mechanism for the expression of the cathepsin L gene.