Cloning and sequence analysis of cDNA for human cathepsin D.

Abstract
An 1110-base-pair cDNA clone for human cathepsin D was obtained by screening a λgt10 human hepatoma G2 cDNA library with a human renin exon 3 genomic fragment. Poly(A)+ RNA blot analysis with this cathepsin D clone demonstrated a message length of approximately 2.2 kilobases. The partial clone was used to screen a size-selected human kidney cDNA library, from which two cathepsin D recombinant plasmids with inserts of approximately 2200 and 2150 base pairs were obtained. The nucleotide sequences of these clones and of the λgt10 clone were determined. The amino acid sequence predicted from the cDNA sequence shows that human cathepsin D consists of 412 amino acids, with 20 and 44 amino acids in a pre- and a prosegment, respectively. The mature protein region shows 87% amino acid identity with porcine cathepsin D but differs in having 9 additional amino acids. Two of these are at the COOH terminus; the other 7 are located at the previously determined junction between the L and H chains of porcine cathepsin D. A high degree of sequence homology was observed between human cathepsin D and other aspartyl proteases, suggesting a conservation of three-dimensional structure in this family of proteins.
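
The segment lengths quoted above imply residue boundaries and a minimum coding length that can be checked with simple arithmetic. The following is a minimal sketch, not part of the original work. It assumes that the 412-residue total includes the pre- and prosegments, that residues are numbered from 1 at the start of the presegment, and that the coding region ends with a single stop codon. The function names are hypothetical and chosen for illustration only.

```python
# Segment lengths as stated in the abstract (in amino acids).
TOTAL_LENGTH = 412
PRE_LENGTH = 20
PRO_LENGTH = 44


def segment_ranges(total, pre, pro):
    """Return 1-based inclusive (start, end) residue ranges per segment.

    Assumption: the total includes the pre- and prosegments, and
    numbering starts at 1 with the first residue of the presegment.
    """
    return {
        "pre": (1, pre),
        "pro": (pre + 1, pre + pro),
        "mature": (pre + pro + 1, total),
    }


def coding_nucleotides(total_residues, include_stop=True):
    """Nucleotides needed to encode the residues, optionally plus one stop codon."""
    return 3 * total_residues + (3 if include_stop else 0)


ranges = segment_ranges(TOTAL_LENGTH, PRE_LENGTH, PRO_LENGTH)
mature_length = ranges["mature"][1] - ranges["mature"][0] + 1

print(ranges)
print("mature length:", mature_length)
print("coding nucleotides:", coding_nucleotides(TOTAL_LENGTH))
```

Under these assumptions the mature region spans residues 65 to 412 (348 residues), and the open reading frame occupies 1239 nucleotides including the stop codon, which fits within the inserts of approximately 2200 and 2150 base pairs with room left for untranslated sequence.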